HK1211399B - Video encoding apparatus and video decoding apparatus - Google Patents

Video encoding apparatus and video decoding apparatus

Publication number
HK1211399B
Authority
HK
Hong Kong
Prior art keywords
transform
partition
transformation
frequency
unit
Prior art date
Application number
HK15111995.5A
Other languages
Chinese (zh)
Other versions
HK1211399A1 (en
Inventor
山本智幸
猪饲知宏
Original Assignee
夏普株式会社
Priority date
Filing date
Publication date
Application filed by 夏普株式会社
Publication of HK1211399A1
Publication of HK1211399B

Description

Moving picture encoding device and moving picture decoding device

This application is a divisional application of the application with application number "201080015255.5", filed on March 17, 2010, and titled "Moving picture encoding device and moving picture decoding device".

Technical Field

The present invention relates to a moving picture encoding device that encodes a moving picture to generate coded data, and to a moving picture decoding device that reproduces a moving picture from coded data of the moving picture that has been transmitted and stored.

Background Art

<Introduction and Definition of Basic Terms>

In block-based moving picture coding, the input moving picture to be coded is divided into predetermined processing units called macroblocks (hereinafter, MBs), and coding is performed on each MB to generate coded data. When the moving picture is reproduced, the coded data to be decoded is processed MB by MB to generate a decoded image.

A widely used block-based moving picture coding scheme is the one specified in Non-Patent Document 1 (H.264/AVC (Advanced Video Coding)). In H.264/AVC, a predicted image that approximates the input moving picture, divided into MB units, is generated, and the prediction residual, which is the difference between the input moving picture and the predicted image, is calculated. A frequency transform, typified by the discrete cosine transform (DCT), is applied to the prediction residual to derive transform coefficients. The derived transform coefficients are variable-length coded using a method called CABAC (Context-based Adaptive Binary Arithmetic Coding) or CAVLC (Context-based Adaptive Variable Length Coding). The predicted image is generated either by intra prediction, which exploits the spatial correlation of the moving picture, or by inter prediction (motion-compensated prediction), which exploits its temporal correlation.

<Partition Concept and Effects>

In inter prediction, an image approximating the input moving picture of the MB to be coded is generated in units called partitions. Each partition is associated with one or two motion vectors. Based on the motion vectors, the region corresponding to the MB to be coded is referenced on a locally decoded image stored in the frame memory, and the predicted image is thereby generated. The locally decoded image referenced at this time is called a reference image. In H.264/AVC, partition sizes of 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4 pixels are available. With a small partition size, motion vectors can be specified in fine units, so a predicted image close to the input moving picture can be generated even when the spatial correlation of motion is low. With a large partition size, on the other hand, the amount of code needed to encode motion vectors can be reduced when the spatial correlation of motion is high.
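The per-partition prediction described above can be illustrated with a minimal sketch. It assumes integer-pixel motion vectors only and performs no interpolation or picture-boundary handling; the function names `predict_partition` and `prediction_residual` and the toy frame are hypothetical and are not taken from H.264/AVC or from this document.

```python
def predict_partition(reference, x, y, width, height, mv):
    """Copy a width x height block from the reference image.

    (x, y) is the top-left corner of the partition and mv = (dx, dy) is an
    integer motion vector. Sub-pixel interpolation is deliberately omitted.
    """
    dx, dy = mv
    return [
        [reference[y + dy + row][x + dx + col] for col in range(width)]
        for row in range(height)
    ]


def prediction_residual(current, predicted):
    """Element-wise difference between the input block and the predicted block."""
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, predicted)
    ]


# Toy 8x8 reference image whose pixel value encodes its position.
reference = [[10 * r + c for c in range(8)] for r in range(8)]

# The 4x4 partition at (x=2, y=2) of the current picture shows the same content
# as the reference, shifted one pixel to the right.
current_block = [[reference[2 + r][1 + c] for c in range(4)] for r in range(4)]

matched = predict_partition(reference, 2, 2, 4, 4, mv=(-1, 0))
unmatched = predict_partition(reference, 2, 2, 4, 4, mv=(0, 0))

residual_matched = prediction_residual(current_block, matched)
residual_unmatched = prediction_residual(current_block, unmatched)
```

With the matching motion vector the residual is all zeros, while the zero motion vector leaves a nonzero residual that would still have to be transformed and coded.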

<Transform Size Concept and Effects>

In the prediction residual generated using the predicted image, the spatial or temporal redundancy of the pixel values of the input moving picture has been reduced. Furthermore, applying the DCT to the prediction residual concentrates energy in the low-frequency components of the transform coefficients. Performing variable-length coding that exploits this energy bias therefore reduces the amount of coded data compared with the case where neither the predicted image nor the DCT is used.
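The energy concentration mentioned above can be checked numerically. The sketch below uses a textbook floating-point orthonormal DCT-II in one dimension, which is an assumption made for illustration; it is not the integer transform that any particular standard specifies, and the smooth ramp signal is made up.

```python
import math


def dct_1d(signal):
    """Orthonormal one-dimensional DCT-II of a list of numbers."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            signal[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        coeffs.append(scale * total)
    return coeffs


# A smooth (highly correlated) row of eight samples.
smooth = [0, 1, 2, 3, 4, 5, 6, 7]

coeffs = dct_1d(smooth)
energy = [c * c for c in coeffs]

# Share of the total energy carried by the two lowest-frequency coefficients.
low_freq_share = sum(energy[:2]) / sum(energy)
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the samples; for this ramp almost all of it sits in the first two coefficients, which is the bias that variable-length coding exploits.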

In H.264/AVC, to strengthen the concentration of energy in the low-frequency components achieved by the DCT, a DCT suited to the local properties of the moving picture is selected from DCTs of several transform sizes (block-adaptive transform selection). For example, when the predicted image is generated by inter prediction, the DCT suited to transforming the prediction residual can be selected from two DCTs, 8×8 DCT and 4×4 DCT. Because 8×8 DCT can exploit the spatial correlation of pixel values over a wider range, it is effective for flat regions with relatively few high-frequency components. 4×4 DCT, on the other hand, is effective for regions with many high-frequency components, such as those containing object contours. Within H.264/AVC, 8×8 DCT can be regarded as the large-transform-size DCT and 4×4 DCT as the small-transform-size DCT.

Also, in H.264/AVC, either 8×8 DCT or 4×4 DCT can be selected when the area of a partition is 8×8 pixels or larger. When the area of a partition is smaller than 8×8 pixels, 4×4 DCT can be selected.

As described above, H.264/AVC can select a suitable partition size and transform size according to how high the spatial correlation of pixel values and the spatial correlation of motion vectors are, both of which are local properties of the moving picture, and can therefore reduce the amount of coded data.

<Description of Adaptive Transform Size Expansion and Partition Size Expansion>

In recent years, high-definition moving pictures with a resolution of HD (1920 × 1080 pixels) or higher have been increasing. Compared with conventional low-resolution moving pictures, the spatial correlation of pixel values and the spatial correlation of motion vectors in a local region of a high-definition moving picture can take a wider range of values. In particular, for both pixel values and motion vectors, high-definition moving pictures tend to have high spatial correlation within local regions.

Non-Patent Document 2 describes a moving picture coding scheme that reduces the amount of coded data by extending the partition sizes and transform sizes of H.264/AVC so as to exploit the spatial correlation properties of high-definition moving pictures described above.

Specifically, in addition to the partition sizes specified in H.264/AVC, partition sizes of 64×64, 64×32, 32×64, 32×32, 32×16, and 16×32 are added. Furthermore, in addition to the DCTs specified in H.264/AVC, DCTs with three new transform sizes are added: 16×16 DCT, 16×8 DCT, and 8×16 DCT.

When the area of a partition is 16×16 pixels or larger, 16×16 DCT, 8×8 DCT, or 4×4 DCT can be selected. When the partition size is 16×8, 16×8 DCT, 8×8 DCT, or 4×4 DCT can be selected. When the partition size is 8×16, 8×16 DCT, 8×8 DCT, or 4×4 DCT can be selected. When the partition size is 8×8, 8×8 DCT or 4×4 DCT can be selected. When the area of a partition is smaller than 8×8 pixels, 4×4 DCT can be selected.
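The selection rules just listed for the scheme of Non-Patent Document 2 can be written as a small lookup. This is only a restatement of the paragraph above under the assumption that sizes are given as (width, height) in pixels; the function name `selectable_transforms` is hypothetical.

```python
def selectable_transforms(width, height):
    """Return the selectable DCT sizes, as (width, height), for a partition."""
    area = width * height
    if area >= 16 * 16:
        return [(16, 16), (8, 8), (4, 4)]
    if (width, height) == (16, 8):
        return [(16, 8), (8, 8), (4, 4)]
    if (width, height) == (8, 16):
        return [(8, 16), (8, 8), (4, 4)]
    if (width, height) == (8, 8):
        return [(8, 8), (4, 4)]
    if area < 8 * 8:
        return [(4, 4)]
    # Shapes the text does not cover are rejected rather than guessed.
    raise ValueError(f"no rule stated for a {width}x{height} partition")


options_64x64 = selectable_transforms(64, 64)
options_16x8 = selectable_transforms(16, 8)
options_4x8 = selectable_transforms(4, 8)
```

Note that under these rules even a 64×64 partition keeps 4×4 DCT as an option, so a transform choice among three candidates must be signalled for every such partition.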

In the scheme described in Non-Patent Document 2, switching among the various partition sizes and transform sizes described above makes it possible to select a partition size and transform size suited to the local properties of the moving picture, even for high-definition moving pictures in which the spatial correlation of pixel values or motion vectors varies over a relatively wide range, and the amount of coded data can therefore be reduced.

Prior Art Documents

Non-Patent Documents

Non-Patent Document 1: ITU-T Recommendation H.264 (11/07)

Non-Patent Document 2: ITU-T T09-SG16-C-0123

Summary of the Invention

Problems to Be Solved by the Invention

As described above, increasing the number of selectable partition sizes and transform sizes in a moving picture coding scheme helps reduce the amount of coded data. However, it creates a new problem: in each local region of the moving picture, the amount of code of the additional information needed to select the partition size and transform size applied at decoding increases.

In Non-Patent Documents 1 and 2, a frequency transform with a small transform size (4×4 DCT) can be used even when the partition size is large. However, large partitions tend to be selected in regions where the spatial correlation of pixel values or motion vectors is high, so applying a small-transform-size frequency transform to such a partition makes it harder to concentrate the energy of the prediction residual into a small number of transform coefficients than applying a large-transform-size one. As a result, the small-transform-size frequency transform is rarely selected, and the additional information needed to select the transform size is wasted. In particular, when the maximum partition size is enlarged and the gap between the large partition sizes and the small transform sizes widens, the small transform size becomes even less likely to be selected.

Furthermore, in Non-Patent Document 2, a frequency transform whose transform size equals the partition size can be selected for rectangular partitions, but the document does not state what criterion should determine the selectable transform sizes when still more transform sizes are added.

Disclosure of the Invention

The present invention has been made in view of this situation. Its object is to provide a moving picture encoding device that, where a wide variety of partition sizes and transform sizes are available, reduces the amount of code of the additional information while preserving the ability to select partition sizes and transform sizes suited to the local characteristics of the moving picture. A further object is to provide a moving picture decoding device capable of decoding coded data produced by that moving picture encoding device.

Means for Solving the Problems

To solve the problems described above, a first technical means of the present invention is a moving picture encoding device that divides an input moving picture into blocks of a given size and performs encoding block by block, the device comprising: a prediction parameter determination unit that determines the partition structure of a block; a predicted image generation unit that generates a predicted image in units of the partitions specified by the partition structure; a transform coefficient generation unit that applies one of the transforms included in a given transform preset set to the prediction residual, which is the difference between the predicted image and the input moving picture; a transform candidate derivation unit that generates, based on partition shape information, a transform candidate list, which is a list of applicable transforms; a frequency transform determination unit that determines, based on the transform candidate list, a transform selection flag indicating the transform to be applied to each partition; and a variable-length encoding unit that variable-length encodes the transform selection flag based on the transform candidate list and the transform preset set.

A second technical means is the first technical means, further comprising a transform constraint derivation unit that generates, based on the partition shape information, a prohibited transform list, which is a list of transforms that cannot be applied to each partition, wherein the transform candidate list is derived from the prohibited transform list and the transform preset set.

A third technical means is the first or second technical means, wherein the partition shape information is the ratio of the vertical length to the horizontal length of the partition, or the magnitude relationship between the vertical length and the horizontal length of the partition.

A fourth technical means is the first or second technical means, wherein the partition shape information is the smaller of the vertical length and the horizontal length of the partition.

A fifth technical means is the first or second technical means, wherein the partition structure is expressed as a hierarchical structure in which each partition is assigned to one of the hierarchy levels according to its shape, and the partition shape information is the hierarchy level to which the partition belongs.

A sixth technical means is the first technical means, wherein the given transform preset set includes at least one transform whose transform size is a horizontally long rectangle one pixel high, and the transform candidate list generation unit includes that transform in the transform candidate list when the horizontal length of the partition is greater than its vertical length.

A seventh technical means is the second technical means, wherein the given transform preset set includes at least one transform whose transform size is square and at least one transform whose transform size is a horizontally or vertically long rectangle, and the transform constraint derivation unit includes at least one square transform in the prohibited transform list when the vertical length and the horizontal length of the partition differ.

An eighth technical means is the second technical means, wherein the given transform preset set includes at least one transform whose transform size is a horizontally long rectangle and at least one transform whose transform size is a vertically long rectangle, and the transform constraint derivation unit includes the transforms with vertically long rectangular transform sizes in the prohibited transform list when the horizontal length of the partition is greater than its vertical length.

A ninth technical means is the second technical means, wherein the given transform preset set includes at least two transforms whose transform sizes are geometrically similar to each other, and, when the smaller of the vertical length and the horizontal length of the partition is equal to or greater than a given threshold, the transform constraint derivation unit includes in the prohibited transform list the transform with the smallest transform size among those similar transforms.

A tenth technical means is the first technical means, wherein the given transform preset set includes a first transform and a second transform, the second transform being geometrically similar to the first transform and having a smaller transform size; the partition structure is expressed as a hierarchical structure in which each partition is assigned to one of the hierarchy levels according to its shape; and the transform constraint derivation unit, when the partition belongs to a given hierarchy level that is not the lowest, includes the first transform in the transform candidate list and excludes the second transform from it, and, when the partition belongs to a hierarchy level lower than that given level, includes the second transform in the transform candidate list.

An eleventh technical means is a moving picture decoding device that decodes input coded data block by block, the device comprising: a variable-length code decoding unit that decodes, from the input coded data, the partition structure of the block to be processed; a predicted image generation unit that generates a predicted image in units of the partitions specified by the partition structure; and a transform control derivation unit that derives, based on partition shape information, transform constraints and/or transform candidates for the applicable transforms, wherein the variable-length code decoding unit decodes a transform selection flag based on the input coded data and the transform constraints and/or transform candidates, and decodes the transform coefficients of the block to be processed based on the transform selection flag, the device further comprising: a prediction residual reconstruction unit that reconstructs the prediction residual by applying to the transform coefficients the inverse transform corresponding to the transform specified by the transform selection flag; and a locally decoded image generation unit that outputs, based on the predicted image and the prediction residual, decoded image data corresponding to the block to be processed.

A twelfth technical means is a moving picture decoding device that decodes an image by processing input coded data block by block, the device comprising: a variable-length code decoding unit that decodes prediction parameters for determining the partition structure of the block to be processed, and a rule for specifying or updating the transforms applicable to a partition or a set of partitions; and a predicted image generation unit that generates a predicted image in units of the partitions specified by the partition structure, wherein the variable-length code decoding unit decodes a transform selection flag based on the input coded data and the rule, and decodes the transform coefficients of the block to be processed based on the transform selection flag, the device further comprising: a prediction residual reconstruction unit that reconstructs the prediction residual by applying to the transform coefficients the inverse transform corresponding to the transform specified by the transform selection flag; and a locally decoded image generation unit that outputs, based on the predicted image and the prediction residual, decoded image data corresponding to the block to be processed.

A thirteenth technical means is the twelfth technical means, wherein the rule includes a rule specifying that a specific type of transform is to be included in the transform candidate list for partitions of a specific shape.

A fourteenth technical means is the twelfth technical means, wherein the rule includes a rule specifying that a specific type of transform is prohibited from being included in the transform candidate list for partitions of a specific shape.

A fifteenth technical means is the twelfth technical means, wherein the rule includes a rule specifying that, when a specific type of transform is included in the transform candidate list for partitions of a specific shape, that transform is to be replaced with another transform.

A sixteenth technical means is the twelfth technical means, wherein the rule includes a composite rule expressed as a combination of basic rules, each of which permits, prohibits, or replaces a specific type of transform for partitions of a specific shape.

A seventeenth technical means is the sixteenth technical means, wherein the rule includes, as a composite rule, a rule specifying that, for partitions belonging to a hierarchy level above a specific level, the smaller-size transform among geometrically similar transforms is prohibited from being included in the transform candidate list.

An eighteenth technical means is the sixteenth or seventeenth technical means, wherein the variable-length encoding unit encodes a flag indicating whether the composite rule is applied as the rule.

A nineteenth technical means is a moving picture encoding device that divides an input moving picture into blocks of a given size and performs encoding block by block, the device comprising: a prediction parameter determination unit that determines the partition structure of a block; a predicted image generation unit that generates a predicted image in units of the partitions specified by the partition structure; a transform coefficient generation unit that applies one of the frequency transforms included in a given transform preset set to the prediction residual, which is the difference between the predicted image and the input moving picture; a transform control derivation unit that derives, based on partition shape information, transform constraints and/or transform candidates for the transforms applicable to each partition; a frequency transform determination unit that determines, based on the transform constraints and/or transform candidates, a transform selection flag indicating the transform to be applied; a transform candidate derivation rule determination unit that determines a rule for specifying or updating the method by which the transform control derivation unit derives the transform constraints and/or transform candidates; and a variable-length encoding unit that variable-length encodes the transform selection flag based on the transform constraints and/or the transform candidate list and the transform preset set, and variable-length encodes the transform candidate derivation rule for each given unit larger than a block.
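The shape-based restrictions of the seventh to ninth technical means can be sketched as follows. Several points are assumptions made only for illustration and are not stated in the text: the contents of `PRESET`, the threshold value 32, the choice of the largest square transform as the one prohibited for non-square partitions, the application of all three means together, and the fixed-length estimate of the flag cost in `flag_bits` (the device itself uses variable-length coding). A practical implementation would presumably also exclude transforms larger than the partition, which is omitted here.

```python
import math

# Hypothetical transform preset set, as (width, height) in pixels.
PRESET = [(4, 4), (8, 8), (16, 16), (16, 8), (8, 16)]
# Hypothetical threshold for the ninth means.
MIN_SIDE_THRESHOLD = 32


def prohibited_transforms(part_w, part_h, preset=PRESET, threshold=MIN_SIDE_THRESHOLD):
    """Derive a prohibited transform list from the partition shape."""
    prohibited = set()

    # Seventh means: for a non-square partition, prohibit at least one square
    # transform. Picking the largest one is an assumption of this sketch.
    squares = [t for t in preset if t[0] == t[1]]
    if part_w != part_h and squares:
        prohibited.add(max(squares))

    # Eighth means: for a horizontally long partition, prohibit vertically
    # long transforms. Only the case stated in the text is implemented.
    if part_w > part_h:
        prohibited.update(t for t in preset if t[1] > t[0])

    # Ninth means: if the shorter side reaches the threshold, prohibit the
    # smallest transform in each group of geometrically similar transforms.
    if min(part_w, part_h) >= threshold:
        groups = {}
        for w, h in preset:
            g = math.gcd(w, h)
            groups.setdefault((w // g, h // g), []).append((w, h))
        for members in groups.values():
            if len(members) >= 2:
                prohibited.add(min(members))

    return prohibited


def transform_candidates(part_w, part_h, preset=PRESET):
    """Transform candidate list: the preset minus the prohibited transforms."""
    banned = prohibited_transforms(part_w, part_h, preset)
    return [t for t in preset if t not in banned]


def flag_bits(num_candidates):
    """Bits for a fixed-length selection flag (illustrative estimate only)."""
    return (num_candidates - 1).bit_length()


candidates_64x32 = transform_candidates(64, 32)
bits_unrestricted = flag_bits(len(PRESET))
bits_restricted = flag_bits(len(candidates_64x32))
```

In this toy setting a 64×32 partition is left with two candidates, so the selection flag would need one bit instead of three under the fixed-length assumption.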

Effects of the Invention

In the moving picture encoding device of the present invention, when a specific partition size is selected, the selectable transform sizes are limited to those that are likely to be effective. This keeps the probability of being able to select a transform size suited to the local characteristics of the moving picture high while reducing the amount of code of the additional information, and it also reduces the amount of encoding processing. The moving picture decoding device of the present invention can decode coded data produced by that moving picture encoding device.

Brief Description of the Drawings

FIG. 1 is a diagram illustrating the definition of the extended macroblock (MB) and the processing order.

FIG. 2 is a block diagram showing an embodiment of the moving picture encoding device of the present invention.

FIG. 3 is a diagram illustrating the definition of the partition hierarchy and the processing order.

FIG. 4 is a flowchart illustrating an example of the process for generating the prohibited transform list.

FIG. 5 is a flowchart illustrating another example of the process for generating the prohibited transform list.

FIG. 6 is a diagram illustrating partition division when the prohibited transform list is generated.

FIG. 7 is another diagram illustrating partition division when the prohibited transform list is generated.

FIG. 8 is a flowchart illustrating yet another example of the process for generating the prohibited transform list.

FIG. 9 is a diagram illustrating a specific example of the procedure for generating the prohibited transform list.

FIG. 10 is a flowchart illustrating an example of the process for generating coded data of the transform selection flag.

FIG. 11 is a block diagram showing an embodiment of the moving picture decoding device of the present invention.

FIG. 12 is a block diagram showing another embodiment of the moving picture encoding device of the present invention.

FIG. 13 is a flowchart illustrating an example of the process for generating the transform candidate list.

FIG. 14 is a block diagram showing another embodiment of the moving picture decoding device of the present invention.

FIG. 15 is a block diagram showing yet another embodiment of the moving picture encoding device of the present invention.

FIG. 16 is a block diagram showing yet another embodiment of the moving picture decoding device of the present invention.

具体实施方式DETAILED DESCRIPTION

(实施方式1)(Implementation 1)

以下,参照图1~图11来说明作为本发明的运动图像编码装置以及运动图像解码装置的一实施方式的运动图像编码装置10以及运动图像解码装置20。另外,在附图的说明中,对同一要素标注同一符号并省略说明。1 to 11 , a moving picture encoding device 10 and a moving picture decoding device 20 as one embodiment of the moving picture encoding device and the moving picture decoding device of the present invention are described. In the description of the drawings, the same elements are denoted by the same reference numerals and their description is omitted.

在以下的说明中,设在运动图像编码装置中以由64×64像素构成的扩展MB单位依次输入输入运动图像来执行处理。此外,对于扩展MB的输入顺序,假设图1所示的光栅扫描顺序。但是,本发明也可以适用于扩展MB的尺寸为上述以外的情况。尤其对于比当前正广泛利用的处理单位即16×16像素大的尺寸的扩展MB是有效的。In the following description, it is assumed that the moving picture encoding device sequentially inputs an input moving picture in extended MB units consisting of 64×64 pixels and performs processing. Furthermore, the order in which the extended MBs are input is assumed to be the raster scan order shown in Figure 1. However, the present invention is also applicable to extended MB sizes other than those described above. It is particularly effective for extended MBs larger than the currently widely used processing unit of 16×16 pixels.

以下的说明中的运动图像编码装置以及运动图像解码装置中的处理,设基于H.264/AVC来实现,对于动作,对于没有特别说明的部分,设遵循H.264/AVC的动作。但是,本发明不限定于H.264/AVC,也能够适用于类似的VC-1、MPEG-2、AVS等方式、以及采用块单位的处理、频率变换的其他运动图像编码方式。The processing in the video encoding and decoding devices described below is assumed to be implemented based on H.264/AVC. Unless otherwise specified, operations are assumed to comply with H.264/AVC. However, the present invention is not limited to H.264/AVC and is applicable to similar video encoding methods such as VC-1, MPEG-2, and AVS, as well as other video encoding methods that utilize block-based processing and frequency conversion.

<运动图像编码装置10的构成><Configuration of Moving Image Coding Device 10>

图2是表示运动图像编码装置10的构成的框图。运动图像编码装置10包括帧存储器101、预测参数决定部102、预测图像生成部103、变换制约导出部104、频率变换决定部105、预测残差生成部106、变换系数生成部107、可变长编码部108、预测残差重建部109、局部解码图像生成部110。FIG2 is a block diagram showing the configuration of the moving picture coding apparatus 10. The moving picture coding apparatus 10 includes a frame memory 101, a prediction parameter determination unit 102, a prediction image generation unit 103, a transform constraint derivation unit 104, a frequency transform determination unit 105, a prediction residual generation unit 106, a transform coefficient generation unit 107, a variable length coding unit 108, a prediction residual reconstruction unit 109, and a local decoded image generation unit 110.

<Frame Memory 101>

The frame memory 101 stores locally decoded images. A locally decoded image is an image generated by adding a predicted image to the prediction residual reconstructed by applying an inverse frequency transform to the transform coefficients. At the time a particular extended MB of a particular frame of the input moving picture is processed, the frame memory 101 holds the locally decoded images of the frames encoded before the frame being processed, together with the locally decoded images corresponding to the extended MBs encoded before the extended MB being processed. The locally decoded images stored in the frame memory 101 can be read out as needed by each component of the device.

<Prediction Parameter Determination Unit 102 (Definition of Partition Structure, Description of Mode Decision)>

The prediction parameter determination unit 102 determines and outputs prediction parameters based on the local properties of the input moving picture. The prediction parameters include at least a partition structure, which indicates how the partitions are arranged within the extended MB, and motion information used for inter prediction (a motion vector and the index of the locally decoded image to be referenced, i.e. the reference image index). They may also include an intra prediction mode indicating the method used to generate the predicted image in intra prediction.

The partition structure is described in detail with reference to FIG. 3. The partition structure is expressed as a layered structure: the layer whose processing unit is 64×64 pixels is defined as layer L0, the layer whose processing unit is 32×32 pixels as layer L1, the layer whose processing unit is 16×16 pixels as layer L2, and the layer whose processing unit is 8×8 pixels as layer L3. In each layer, one of the following division methods can be selected: single division, which does not split the region; horizontal two-way division, in which a horizontal line splits the region into two equal parts; vertical two-way division, in which a vertical line splits the region into two equal parts; and four-way division, in which one horizontal line and one vertical line split the region into four equal parts. A layer with a larger processing unit is called an upper layer, and a layer with a smaller processing unit is called a lower layer. In this embodiment, layer L0 is the highest layer and layer L3 is the lowest layer. The partition structure is expressed by fixing the division method of each layer in turn, starting from the highest layer L0. Specifically, the following steps express the partition structure uniquely.

(Step S10) If the division method in layer L0 is single division, horizontal two-way division, or vertical two-way division, the regions expressed by that division method become the partitions of the processing unit in layer L0. If the division method is four-way division, the partitions of each of the four regions are determined by step S11.

(Step S11) If the division method in layer L1 is single division, horizontal two-way division, or vertical two-way division, the regions expressed by that division method become the partitions of the processing unit in layer L1. If the division method is four-way division, the partitions of each of the four regions are determined by step S12.

(Step S12) If the division method in layer L2 is single division, horizontal two-way division, or vertical two-way division, the regions expressed by that division method become the partitions of the processing unit in layer L2. If the division method is four-way division, the partitions of each of the four regions are determined by step S13.

(Step S13) The regions expressed by the division method in layer L3 become the partitions of the processing unit in layer L3.

The order in which the partitions within an extended MB are processed is as follows. As shown in FIG. 3, within each layer the processing is performed in raster scan order regardless of the division method. However, in every layer other than the lowest layer (layer L3), when four-way division is selected, the partitions expressed by the lower layer are processed for each of the four regions in turn, taking the four regions in raster scan order. In each of the units described below, this order is assumed to apply whenever the partitions within an extended MB are processed.
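The layered description in steps S10 to S13 and the processing order can be sketched as a short recursive generator. This is only an illustrative sketch: the node labels "one", "hor", "ver", and "quad", the nested-list representation, and the function name `partitions` are assumptions made for this example and do not appear in the source.

```python
def partitions(node, x=0, y=0, size=64, min_size=8):
    """Yield (x, y, width, height) for each partition, in processing order.

    node is "one", "hor", "ver", "quad", or a list of four child nodes.
    These labels are illustrative and are not taken from the source text.
    """
    half = size // 2
    quadrants = [(0, 0), (half, 0), (0, half), (half, half)]  # raster order
    if node == "one":
        yield (x, y, size, size)
    elif node == "hor":  # a horizontal line gives a top and a bottom half
        yield (x, y, size, half)
        yield (x, y + half, size, half)
    elif node == "ver":  # a vertical line gives a left and a right half
        yield (x, y, half, size)
        yield (x + half, y, half, size)
    elif node == "quad":  # four-way division in the lowest layer (S13)
        if size != min_size:
            raise ValueError("a bare 'quad' is only valid in the lowest layer")
        for dx, dy in quadrants:
            yield (x + dx, y + dy, half, half)
    elif isinstance(node, list) and len(node) == 4:
        if size == min_size:
            raise ValueError("the lowest layer cannot have child layers")
        # Four-way division above the lowest layer: each quadrant is
        # partitioned by the next lower layer (S10 to S12).
        for (dx, dy), child in zip(quadrants, node):
            yield from partitions(child, x + dx, y + dy, half, min_size)
    else:
        raise ValueError(f"unknown node: {node!r}")


# Layer L0 uses four-way division; the four 32x32 regions use layer L1.
structure = ["one", "hor", "ver", "one"]
for part in partitions(structure):
    print(part)
```

Because the recursion finishes one quadrant before moving to the next, the partitions of a lower layer are emitted together, which matches the order described for four-way division.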

The layer Lx to which a partition p belongs is derived by the following steps.

(Step S20) If the size of the partition p is equal to the size of a partition generated by single division, horizontal two-way division, or vertical two-way division of a particular layer Ly, Lx is set to Ly.

(Step S21) Otherwise, Lx is set to L3 (that is, to the lowest layer).
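Steps S20 and S21 can be sketched as a lookup over the layer processing units defined earlier (64, 32, 16, and 8 pixels). The function name `layer_of` and the dictionary `LAYER_UNIT` are hypothetical names used only for this sketch.

```python
# Processing-unit size of each layer, as defined for layers L0 to L3.
LAYER_UNIT = {0: 64, 1: 32, 2: 16, 3: 8}


def layer_of(m, n):
    """Return the layer index x of an M x N partition (width m, height n)."""
    for layer, s in LAYER_UNIT.items():
        # S20: sizes produced by single, horizontal two-way, or vertical
        # two-way division of a processing unit of size s.
        if (m, n) in {(s, s), (s, s // 2), (s // 2, s)}:
            return layer
    # S21: anything else (for example 4x4) belongs to the lowest layer.
    return 3


print(layer_of(64, 32), layer_of(32, 32), layer_of(4, 4))
```

The three size sets do not overlap between layers, so the first match is the only match.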

<Description of Partition Shape>

Information that characterizes each partition of the partition structure, namely the partition size, information expressing a feature of the partition size, or the layer within the partition structure, is called partition shape information. For example, the partition size itself (such as 32×32), information indicating whether the partition is larger than a given partition size, the ratio of the vertical length of the partition to its horizontal length, the magnitude relationship between the vertical and horizontal lengths, the minimum or maximum of the vertical and horizontal lengths, and the layer to which the partition belongs are all partition shape information.

The prediction parameters are determined by a rate-distortion decision. In the rate-distortion decision, a cost called the rate-distortion cost is calculated for each candidate prediction parameter from the code amount of the coded data obtained when the extended MB being processed is encoded with that prediction parameter and from the distortion between the locally decoded image and the input moving picture, and the prediction parameter that minimizes this cost is selected. That is, the rate-distortion cost is calculated for every possible combination of partition structure and motion information, and the best combination is taken as the prediction parameters. Let R be the code amount of the coded data of the extended MB, D the mean squared error between the input moving picture and the locally decoded image for that extended MB, and λ a parameter expressing the relationship between the code amount R and the error D. The rate-distortion cost C is then calculated as C = D + λR.

The rate-distortion decision thus determines and outputs the prediction parameters suited to encoding the extended MB being processed, namely a suitable partition structure and the motion information corresponding to each partition.
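The selection rule C = D + λR can be illustrated with a few lines of Python. The candidate tuples, their numbers, and the function names below are made up for illustration; obtaining D and R in practice requires actually encoding the extended MB with each candidate, which this sketch does not do.

```python
def rd_cost(distortion, rate, lam):
    """Rate-distortion cost C = D + lambda * R."""
    return distortion + lam * rate


def select_best(candidates, lam):
    """Return the candidate with the smallest rate-distortion cost.

    candidates is a list of (label, D, R) tuples (hypothetical format).
    """
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))


# Hypothetical (label, mean squared error D, code amount R) values.
candidates = [("A", 100.0, 50), ("B", 80.0, 90), ("C", 120.0, 20)]
print(select_best(candidates, lam=1.0)[0])
print(select_best(candidates, lam=0.1)[0])
```

A larger λ weights the code amount more heavily, so the choice shifts toward candidates that cost fewer bits even if their distortion is higher.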

When the rate-distortion cost is calculated for a particular prediction parameter, the frequency transform to be applied to the extended MB being processed may not be uniquely determined. In that case, either the rate-distortion cost obtained by applying one particular frequency transform may be used, or the smallest of the rate-distortion costs obtained by applying each of a plurality of frequency transforms may be used.

<Predicted Image Generation Unit 103>

The predicted image generation unit 103 generates and outputs the predicted image of the extended MB being processed based on the input prediction parameters. The predicted image is generated by the following steps.

(Step S30) Based on the partition structure included in the prediction parameters, the extended MB is divided into partitions, and the predicted image of each partition is generated by step S31.

(Step S31) The motion information corresponding to the partition being processed, namely the motion vector and the reference image index, is read from the prediction parameters. The predicted image is generated by motion-compensated prediction from the pixel values of the region indicated by the motion vector on the locally decoded image indicated by the reference image index.

<Prediction Residual Generation Unit 106>

The prediction residual generation unit 106 generates and outputs the prediction residual of the extended MB based on the input moving picture and the predicted image supplied to it. The prediction residual is two-dimensional data of the same size as the extended MB, and each element is the difference between the corresponding pixels of the input moving picture and the predicted image.

<Transform Coefficient Generation Unit 107>

The transform coefficient generation unit 107 generates and outputs transform coefficients by applying a frequency transform to the prediction residual, based on the prediction residual and the transform selection flag supplied to it. The transform selection flag indicates the frequency transform to be applied to each partition of the extended MB. For each partition within the extended MB, the transform coefficient generation unit 107 selects the frequency transform indicated by the transform selection flag and applies it to the prediction residual. The frequency transform indicated by the transform selection flag is one of the frequency transforms included in the set of all frequency transforms that the transform coefficient generation unit 107 can apply (the transform preset).

The transform preset in this embodiment includes nine frequency transforms: 4×4 DCT, 8×8 DCT, 16×16 DCT, 16×8 DCT, 8×16 DCT, 16×1 DCT, 1×16 DCT, 8×1 DCT, and 1×8 DCT. Each of these corresponds to a DCT (discrete cosine transform) of a particular transform size (for example, 4×4 DCT is a discrete cosine transform whose transform size is 4×4 pixels). The present invention is not limited to this set of frequency transforms and is also applicable to any subset of the transform preset. The transform preset may also include discrete cosine transforms of other transform sizes, such as 32×32 DCT and 64×64 DCT, and may include frequency transforms other than the discrete cosine transform, such as the Hadamard transform, the sine transform, the wavelet transform, or transforms that approximate these.

Applying a frequency transform with a transform size of W×H to a partition of M×N pixels is the processing expressed by the following pseudocode. Here, region R(x, y, w, h) denotes the region w pixels wide and h pixels high whose top-left corner is displaced x pixels to the right and y pixels downward from the top-left corner of the partition.

for (j = 0; j < N; j += H) {
    for (i = 0; i < M; i += W) {
        apply the frequency transform to region R(i, j, W, H)
    }
}
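As a runnable counterpart to the pseudocode above, the following sketch lists the regions R(i, j, W, H) that a W×H transform visits within an M×N partition. The function name `transform_regions` is hypothetical, and the check that W and H divide M and N evenly is an assumption of this sketch; the source does not say how other cases are handled.

```python
def transform_regions(m, n, w, h):
    """List regions R(i, j, w, h) covering an m x n partition, row by row."""
    if m % w or n % h:
        # Assumption of this sketch: the transform size tiles the partition.
        raise ValueError("transform size must divide the partition size")
    return [(i, j, w, h) for j in range(0, n, h) for i in range(0, m, w)]


# A 16x8 partition covered by 8x8 transforms: two regions, left then right.
print(transform_regions(16, 8, 8, 8))
```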

<Transform Constraint Derivation Unit 104>

Based on the input prediction parameters, the transform constraint derivation unit 104 derives, as the transform constraints, the constraints on the frequency transforms that can be selected in each partition within the extended MB, and outputs them. That is, the transform constraint of each partition is derived from the partition shape information of that partition as determined by the prediction parameters.

The transform constraints are defined as a set of prohibited transform lists, each associated with one partition within the extended MB. The elements of a prohibited transform list are those frequency transforms in the transform preset that cannot be selected in the associated partition (prohibited frequency transforms). In other words, removing the elements of the prohibited transform list from the elements of the transform preset yields the set of frequency transforms that can be selected in the associated partition (the transform candidate list).

A prohibited transform list or a transform candidate list can be expressed by transform set information, which contains information indicating whether each transform is included in the set. If the transform preset contains Nt transforms, there are 2^Nt possible combinations of transforms, so transform set information taking values from 0 to 2^Nt − 1 can express which transforms the set contains. The transform set information need not be able to express every combination of transforms; it is enough that it can express values corresponding to particular combinations. As an extreme example, when the transform preset contains only 4×4 DCT and 8×8 DCT, the prohibited list can be expressed by a 1-bit flag indicating whether 4×4 DCT (or 8×8 DCT) is prohibited. Alternatively, the transform candidate list can be expressed by a value from 0 to 2, with 0 corresponding to 4×4 DCT, 1 to the pair of 4×4 DCT and 8×8 DCT, and 2 to 8×8 DCT.

The meaning of the transform set information may also be changed per layer, per partition, per group of blocks, and so on. That is, the same transform set information value 0 may denote 16×16 DCT in layer L0, 8×8 DCT in layer L1, and 4×4 DCT in layer L2. Changing the meaning of the values in this way allows the prohibited transform list and the transform candidate list to be expressed with a smaller range of values.

Accordingly, the transform constraints and transform candidate lists of the present invention are not bound by the term "list" and can be regarded as equivalent to transform set information expressing the transform constraints or transform candidates.
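One way to picture transform set information with values from 0 to 2^Nt − 1 is as a bitmask over the nine transforms of the transform preset. The sketch below is an assumption-laden illustration: the bit order, the string labels, and the function names are chosen for this example and are not specified by the source.

```python
# The nine transforms of the preset; the bit order is an assumption.
PRESET = ["4x4", "8x8", "16x16", "16x8", "8x16", "16x1", "1x16", "8x1", "1x8"]
FULL = (1 << len(PRESET)) - 1  # all Nt bits set, i.e. 2^Nt - 1


def encode(transforms):
    """Pack a collection of transform labels into one integer."""
    bits = 0
    for t in transforms:
        bits |= 1 << PRESET.index(t)
    return bits


def decode(bits):
    """Unpack an integer into the list of transform labels it contains."""
    return [t for k, t in enumerate(PRESET) if (bits >> k) & 1]


def candidates_from_prohibited(prohibited_bits):
    """Candidate list = preset minus prohibited list, as a bitwise complement."""
    return decode(~prohibited_bits & FULL)


prohibited = encode(["4x4", "8x8"])
print(prohibited, candidates_from_prohibited(prohibited))
```

With this representation, the relationship between the prohibited transform list and the transform candidate list reduces to a single complement within the Nt-bit range.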

The prohibited transform list Lp for a particular partition p is generated by the following steps. The size of the partition p is assumed to be M×N pixels (M pixels horizontally, N pixels vertically), and the partition p is assumed to belong to layer Lx.

(Step S40) Lp is set to empty.

(Step S41) The frequency transforms whose transform size is larger than M×N pixels are added to Lp.

(Step S42) The frequency transforms determined by the value of Min(M, N) are added to Lp.

(Step S43) The frequency transforms determined by the value of M÷N are added to Lp.

(Step S44) The frequency transforms determined by the value of layer Lx are added to Lp.

The information indicating whether a transform size is larger than M×N pixels, the value of Min(M, N), the value of M÷N, and the value of layer Lx are all partition shape information.
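Steps S40 and S41 can be sketched as follows. The source does not define precisely what "larger than M×N pixels" means; this sketch assumes a transform is too large when it is wider or taller than the partition, which is an interpretation and not something the text states. The names `PRESET` and `too_large` are hypothetical.

```python
# Transform sizes of the preset as (width, height) pairs.
PRESET = [(4, 4), (8, 8), (16, 16), (16, 8), (8, 16),
          (16, 1), (1, 16), (8, 1), (1, 8)]


def too_large(m, n):
    """S40 + S41: start from an empty list and add oversized transforms.

    Assumption: "larger than M x N" is read as "does not fit inside the
    partition", i.e. width > M or height > N.
    """
    lp = []  # S40
    for w, h in PRESET:  # S41
        if w > m or h > n:
            lp.append((w, h))
    return lp


print(too_large(8, 8))
```

Steps S42 to S44 would then append further entries to the same list based on the other partition shape information.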

<Restriction of Transform Sizes Based on Min(M, N)>

Step S42 above is described in more detail with reference to the flowchart of FIG. 4.

(Step S50) If Min(M, N) is equal to or greater than a given threshold Th1 (for example, Th1 = 16 pixels), the procedure proceeds to step S51; otherwise it proceeds to step S52.

(Step S51) When the frequency transform list contains two or more frequency transforms whose transform sizes are in an analogous relationship, the frequency transform with the smallest transform size in each such group (4×4 DCT, 8×1 DCT, 1×8 DCT) is added to Lp, and the procedure proceeds to step S52. The analogous relationship here includes geometric similarity. For example, the transform sizes 16×16, 8×8, and 4×4 in the transform preset of this embodiment are in an analogous relationship. It also includes approximate similarity. For example, the transform sizes 16×1 and 8×1, and likewise 1×16 and 1×8, in the transform preset of this embodiment are in an analogous relationship. In addition, although this is not used in the following description, frequency transforms may be classified by transform size into three classes (square, vertically long rectangle, and horizontally long rectangle), and the frequency transforms in each class may be regarded as being in an analogous relationship.

(Step S52) If Min(M, N) is equal to or greater than a given threshold Th2 (for example, Th2 = 32 pixels), the procedure proceeds to step S53; otherwise the processing ends.

(Step S53) When the transform preset contains three or more frequency transforms whose transform sizes are in an analogous relationship, the frequency transform with the second smallest transform size in each such group (8×8 DCT) is added to Lp, and the processing ends. Here, Th2 > Th1.
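Steps S50 to S53 can be sketched with the example thresholds Th1 = 16 and Th2 = 32. The grouping of the preset into analogous transform sizes below follows the examples given in step S51; treating 16×8 DCT and 8×16 DCT as groups of one is an assumption of this sketch, as are the names `GROUPS` and `prohibited_by_min_side`.

```python
# Groups of analogous transform sizes as (width, height), smallest first.
GROUPS = [
    [(4, 4), (8, 8), (16, 16)],  # squares
    [(8, 1), (16, 1)],           # horizontal lines
    [(1, 8), (1, 16)],           # vertical lines
    [(16, 8)],                   # assumed to have no analogous partner
    [(8, 16)],                   # assumed to have no analogous partner
]


def prohibited_by_min_side(m, n, th1=16, th2=32):
    """Transforms that steps S50 to S53 add to Lp for an m x n partition."""
    lp = []
    if min(m, n) >= th1:  # S50
        # S51: smallest transform of every group with two or more members.
        lp += [g[0] for g in GROUPS if len(g) >= 2]
    if min(m, n) >= th2:  # S52
        # S53: second smallest transform of every group with three or more.
        lp += [g[1] for g in GROUPS if len(g) >= 3]
    return lp


print(prohibited_by_min_side(32, 32))
```

Because Th2 > Th1, any partition that reaches step S53 has already passed through step S51, so the two conditions can be tested one after the other as shown.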

A partition is the unit of motion compensation. So that the predicted image generated per partition using motion vectors comes close to the input image, the partition structure is determined such that the inter-frame motion of the image within each partition is uniform. That is, large partitions are assigned to large objects (or parts of them) in the input moving picture, and small partitions are assigned to small objects. In general, in the input moving picture, the spatial correlation of pixel values in a region corresponding to a large object is higher than that in a region corresponding to a small object. For large partitions, therefore, frequency transforms with large transform sizes are more effective than those with small transform sizes. Consequently, even if frequency transforms with reasonably small transform sizes are prohibited for large partitions, the code amount of the coded data hardly increases.

<Restriction of Transform Sizes Based on the Value of M÷N>

Next, step S43 above is described in more detail with reference to the flowchart of FIG. 5.

(Step S60) If the value of M÷N is 2 or greater (the horizontal length of the partition p is at least twice its vertical length), the procedure proceeds to step S61; otherwise it proceeds to step S63.

(Step S61) All frequency transforms with a square transform size (4×4 DCT, 8×8 DCT, 16×16 DCT) are added to Lp, and the procedure proceeds to step S62.

(Step S62) The frequency transforms whose transform size is longer vertically than horizontally (8×16 DCT, 1×16 DCT) are added to Lp, and the processing ends.

(Step S63) If the value of M÷N is 0.5 or less (the vertical length of the partition p is at least twice its horizontal length), the procedure proceeds to step S64; otherwise it proceeds to step S66.

(Step S64) All frequency transforms with a square transform size (4×4 DCT, 8×8 DCT, 16×16 DCT) are added to Lp, and the procedure proceeds to step S65.

(Step S65) The frequency transforms whose transform size is longer horizontally than vertically (16×8 DCT, 16×1 DCT) are added to Lp, and the processing ends.

(Step S66) If the value of M÷N is equal to 1 (the horizontal and vertical lengths of the partition p are equal), the procedure proceeds to step S67; otherwise the processing ends.

(Step S67) The frequency transforms whose transform size differs between the horizontal and vertical lengths (16×8 DCT, 16×1 DCT, 8×16 DCT, 1×16 DCT) are added to Lp.
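Steps S60 to S67 amount to a three-way test on the aspect ratio of the partition, which the following sketch expresses with integer comparisons instead of the division M÷N. The lists reproduce exactly the transforms named in the parentheses of each step; 8×1 DCT and 1×8 DCT are not named there, so this sketch leaves them out, which is an assumption about how to read the source. The function and list names are hypothetical.

```python
# Transform sizes as (width, height), exactly as named in steps S61 to S67.
SQUARE = [(4, 4), (8, 8), (16, 16)]
TALL = [(8, 16), (1, 16)]   # longer vertically than horizontally
WIDE = [(16, 8), (16, 1)]   # longer horizontally than vertically


def prohibited_by_aspect(m, n):
    """Transforms that steps S60 to S67 add to Lp for an m x n partition."""
    if m >= 2 * n:        # S60: M / N >= 2, horizontally long partition
        return SQUARE + TALL          # S61, S62
    if 2 * m <= n:        # S63: M / N <= 0.5, vertically long partition
        return SQUARE + WIDE          # S64, S65
    if m == n:            # S66: square partition
        return WIDE + TALL            # S67
    return []


print(prohibited_by_aspect(64, 32))
```

Comparing m against 2n avoids floating-point division while testing the same conditions as the flowchart.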

The intent of steps S61 and S62 is explained with reference to FIG. 6. As shown in FIG. 6(a), suppose that two objects (a foreground object O and a background B) are present in a processing unit U of some layer, and that the boundary between the foreground object O and the background B lies in the lower part of the processing unit U. In this case, horizontally long rectangular partitions with an M÷N value of 2 or greater are selected, as shown in FIG. 6(b). Conversely, vertically long rectangular partitions such as those shown in FIG. 6(c) are not selected.

Next, for the case where horizontally long rectangular partitions are selected, the relationship between the transform size and the code amount of the coded data in the partition that contains both the background B and the foreground object O is explained with reference to FIG. 6(d) to (f). FIG. 6(d), (e), and (f) show the relationship between the partition and the transform size when square, horizontally long rectangular, and vertically long rectangular transform sizes, respectively, are applied to that partition. With a frequency transform of square transform size (FIG. 6(d)) or of vertically long rectangular transform size (FIG. 6(f)), the boundary often falls inside a region to which the frequency transform is applied.

With a frequency transform of horizontally long rectangular transform size (FIG. 6(e)), on the other hand, the boundary rarely falls inside a region to which the frequency transform is applied. When the boundary lies inside such a region, the frequency transform cannot concentrate the energy in the low-frequency components of the transform coefficients, so more code is needed to encode the transform coefficients. When no boundary lies inside the region, the frequency transform can concentrate the energy in the low-frequency components, so less code is needed. For horizontally long rectangular partitions, therefore, a frequency transform of horizontally long rectangular transform size is more effective than one of square or vertically long rectangular transform size. Consequently, even if frequency transforms of square or vertically long rectangular transform size are prohibited for horizontally long rectangular partitions, the code amount of the coded data hardly increases.

The intent of steps S64 and S65 is the same. That is, even if frequency transforms of square or horizontally long rectangular transform size are prohibited for vertically long rectangular partitions, the code amount of the coded data hardly increases.

The intent of step S66 is explained with reference to FIG. 7. As shown in FIG. 7(a), suppose that two objects (a foreground object O and a background B) are present in a processing unit U of some layer, and that the boundary between the foreground object O and the background B lies in the lower right part of the processing unit U. In this case, square partitions with an M÷N value of 1 are selected, as shown in FIG. 7(b).

Next, for the case where square partitions are selected, the relationship between the transform size and the code amount of the coded data in the partition that contains both the background B and the foreground object O (the lower right partition) is explained with reference to FIG. 7(d) to (f). FIG. 7(d), (e), and (f) show the relationship between the partition and the transform size when square, horizontally long rectangular, and vertically long rectangular transform sizes, respectively, are applied to the lower right partition. Here, whichever of the square, vertically long rectangular, and horizontally long rectangular transform sizes is used, the proportion of transform regions that contain the boundary does not change much. For the lower right partition, therefore, the code amount of the coded data differs little among frequency transforms of square, vertically long rectangular, and horizontally long rectangular transform size.

The partitions in the processing unit U other than the lower right partition, on the other hand, contain only the background B and no boundary, so no boundary lies inside any transform region whichever transform size is used. In that case, a frequency transform of square transform size, which exploits the spatial correlation of the prediction residual pixel values in a well-balanced way in both the horizontal and vertical directions, concentrates energy in the transform coefficients better than a frequency transform of vertically long or horizontally long rectangular transform size. For square partitions, therefore, a frequency transform of square transform size is more effective than one of horizontally long or vertically long rectangular transform size. Consequently, even if frequency transforms of horizontally long or vertically long rectangular transform size are prohibited for square partitions, the code amount of the coded data hardly increases.

<Restriction of Transform Sizes Based on the Layer to Which the Partition Belongs>

Next, step S44 above is described in more detail with reference to the flowchart of FIG. 8.

(Step S70) If layer Lx is the highest layer, the procedure proceeds to step S71; otherwise it proceeds to step S72.

(过程S71)将具有相似形状的变换尺寸的多个频率变换候补(16×16DCT、8×8DCT、4×4DCT)中具有最大变换尺寸的频率变换以外的频率变换(8×8DCT、4×4DCT)追加到Lp,之后结束处理。(Process S71) A frequency transform other than the frequency transform having the largest transform size (8×8 DCT, 4×4 DCT) among a plurality of frequency transform candidates (16×16 DCT, 8×8 DCT, 4×4 DCT) having similar transform sizes is added to Lp, and the process is terminated.

(过程S72)若阶层Lx是最下位阶层则进入过程S73,否则结束处理。(Process S72) If the layer Lx is the lowest layer, the process proceeds to process S73; otherwise, the process ends.

(Process S73) Among the frequency transform candidates with similar-shaped transform sizes (16×16 DCT, 8×8 DCT, 4×4 DCT), every frequency transform other than the one with the smallest transform size (that is, 16×16 DCT and 8×8 DCT) is added to Lp, and the process ends.

When partitions are represented with a hierarchical structure, restricting some frequency transforms with relatively small transform sizes for partitions belonging to the top layer hardly increases the code amount of the encoded data. This is because a transform that cannot be selected in the top layer (for example, 8×8 DCT or 4×4 DCT) can still be selected in a lower layer. In other words, in regions where a frequency transform with a small transform size is effective, the encoder can avoid the top-layer partition and instead choose a lower-layer partition for which the small transform size is selectable, which suppresses any increase in code amount. In particular, because frequency transforms with large transform sizes are effective for large partitions, when the frequency transform candidates include several transforms with similar-shaped transform sizes, it is preferable to restrict the smaller ones in the top layer.

Similarly, when partitions are represented with a hierarchical structure, restricting some frequency transforms with relatively large transform sizes for partitions belonging to the bottom layer hardly increases the code amount of the encoded data. In particular, because frequency transforms with small transform sizes are effective for small partitions, when the frequency transform candidates include several transforms with similar-shaped transform sizes, it is preferable to restrict the larger ones in the bottom layer.
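The layer-dependent restriction of processes S70 to S73 can be sketched as follows. This is only an illustrative sketch: the function name, the `(width, height)` tuple representation, and the `top_layer` / `bottom_layer` parameters are assumptions introduced here, not part of the described device.

```python
# Square ("similar-shaped") candidates named in processes S71 and S73,
# written as (width, height) pairs. The representation is an assumption.
SQUARE_CANDIDATES = [(16, 16), (8, 8), (4, 4)]


def layer_restrictions(layer, top_layer, bottom_layer):
    """Return the square transforms to add to Lp for a partition in `layer`."""
    by_size = sorted(SQUARE_CANDIDATES, key=lambda wh: wh[0] * wh[1])
    if layer == top_layer:
        # S70 -> S71: prohibit everything except the largest transform size.
        return set(by_size[:-1])
    if layer == bottom_layer:
        # S72 -> S73: prohibit everything except the smallest transform size.
        return set(by_size[1:])
    # Intermediate layers: nothing is added by this step.
    return set()
```

For example, with layers numbered 0 to 3, `layer_restrictions(0, 0, 3)` gives the 8×8 and 4×4 transforms, and `layer_restrictions(3, 0, 3)` gives the 16×16 and 8×8 transforms.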

<禁止变换列表生成处理的具体例><Specific Example of Prohibited Conversion List Generation Processing>

A specific example of how the transform constraint derivation unit 104 generates the transform constraints, that is, the prohibited transform list of each partition, for a particular partition structure is described with reference to FIG. 9. As shown in FIG. 9, the extended MB is divided into four at layer L0. At layer L1, the upper-left part is left as a single partition (partition a), the upper-right part is divided horizontally into two (partitions b and c), the lower-left part is divided vertically into two (partitions d and e), and the lower-right part is divided into four.

Within the region divided into four at layer L1, at layer L2 the upper-left part is left as a single partition (partition f), the upper-right part is divided horizontally into two (partitions g and h), the lower-left part is divided vertically into two (partitions i and j), and the lower-right part is divided into four. Each part obtained by the four-way division at layer L2 is left as a single partition at layer L3 (partitions k, l, m, and n). As described above, nine transform sizes are selectable for the frequency transform: 4×4, 8×8, 16×16, 16×1, 1×16, 8×1, 1×8, 16×8, and 8×16.

分区a的大小是32×32像素,属于阶层L1。应用上述禁止变换列表的生成过程后,在过程S51将4×4、8×1、1×8、在过程S52将8×8、在过程S67将1×16、16×1、16×8、8×16的各变换尺寸的频率变换追加到禁止变换列表。Partition a is 32×32 pixels in size and belongs to level L1. After applying the aforementioned prohibited transform list generation process, frequency transforms for transform sizes of 4×4, 8×1, and 1×8 are added to the prohibited transform list in step S51, 8×8 in step S52, and 1×16, 16×1, 16×8, and 8×16 in step S67.

分区b、c的大小是32×16像素,属于阶层L1。应用上述禁止变换列表的生成过程后,在过程S51将4×4、8×1、1×8、在过程S61将4×4、8×8、16×16、在过程S62将1×16、8×16的各变换尺寸的频率变换追加到禁止变换列表。Partitions b and c are 32×16 pixels in size and belong to level L1. After applying the aforementioned prohibited transform list generation process, frequency transforms for transform sizes of 4×4, 8×1, and 1×8 are added to the prohibited transform list in step S51; 4×4, 8×8, and 16×16 are added in step S61; and 1×16 and 8×16 are added in step S62.

分区d、e的大小是16×32像素,属于阶层L1。应用上述禁止变换列表的生成过程后,在过程S51将4×4、8×1、1×8、在过程S64将4×4、8×8、16×16、在过程S65将16×1、16×8的各变换尺寸的频率变换追加到禁止变换列表。Partitions d and e are 16×32 pixels in size and belong to level L1. After applying the aforementioned prohibited transform list generation process, frequency transforms for transform sizes of 4×4, 8×1, and 1×8 are added to the prohibited transform list in step S51; 4×4, 8×8, and 16×16 are added in step S64; and 16×1 and 16×8 are added in step S65.

分区f的大小是16×16像素,属于阶层L2。应用上述禁止变换列表的生成过程后,在过程S51将4×4、8×1、1×8、在过程S67将16×1、1×16、16×8、8×16的各变换尺寸的频率变换追加到禁止变换列表。Partition f is 16×16 pixels in size and belongs to level L2. After applying the aforementioned prohibited transformation list generation process, frequency transforms for transform sizes of 4×4, 8×1, and 1×8 are added to the prohibited transformation list in step S51, and 16×1, 1×16, 16×8, and 8×16 are added to the prohibited transformation list in step S67.

分区g、h的大小是16×8像素,属于阶层L2。应用上述禁止变换列表的生成过程后,在过程S41将16×16、1×16、8×16、在过程S61将4×4、8×8、16×16、在过程S62将1×16、8×16的各变换尺寸的频率变换追加到禁止变换列表。Partitions g and h are 16×8 pixels in size and belong to level L2. After applying the aforementioned prohibited transform list generation process, frequency transforms for transform sizes of 16×16, 1×16, and 8×16 are added to the prohibited transform list in step S41; 4×4, 8×8, and 16×16 are added in step S61; and 1×16 and 8×16 are added in step S62.

分区i、j的大小是8×16像素,属于阶层L2。应用上述禁止变换列表的生成过程后,在过程S41将16×16、16×1、16×8、在过程S64将4×4、8×8、16×16、在过程S65将16×1、16×8的各变换尺寸的频率变换追加到禁止变换列表。Partitions i and j are 8×16 pixels in size and belong to level L2. After applying the aforementioned prohibited transform list generation process, frequency transforms for transform sizes of 16×16, 16×1, and 16×8 are added to the prohibited transform list in step S41; 4×4, 8×8, and 16×16 are added in step S64; and 16×1 and 16×8 are added in step S65.

Partitions k, l, m, and n are 8×8 pixels in size and belong to layer L3. Applying the prohibited transform list generation process described above adds the frequency transforms of the following transform sizes to the prohibited transform list: 16×16, 16×1, 16×8, 1×16, and 8×16 in process S41; 16×1, 16×8, 1×16, and 8×16 in process S67; and 8×8 and 16×16 in process S73.

如上述例那样,对于其他具有分区结构的扩展MB,也可以对扩展MB内的各分区生成禁止变换列表后作为变换制约输出。As in the above example, for other extended MBs having a partition structure, a prohibited conversion list may be generated for each partition within the extended MB and output as a conversion constraint.

另外,在上述说明中,在禁止变换列表生成过程中,执行了全部过程S42、过程S43、过程S44,但是也可以仅使用这些过程中的一部分。此外,在过程S42的详细过程中,可以仅执行过程S50的判定和过程S51的判定中的任一方。此外,在过程S43的详细过程中,关于判定,可以仅执行过程S60、过程63、过程66的各判定的一部分,并且关于判定后的处理,可以仅执行过程S61和过程S62的任一方、过程S64和过程S65的任一方。此外,在过程S44的详细过程中,可以仅执行过程S70和过程72的判定的任一方。在进行了那样的过程省略的情况下,能够减轻禁止变换列表生成所需要的计算处理。In the above description, all of processes S42, S43, and S44 are executed during the prohibited transformation list generation process. However, only a portion of these processes may be used. Furthermore, in the detailed process of process S42, only one of the determinations in process S50 and process S51 may be executed. Furthermore, in the detailed process of process S43, only a portion of each of processes S60, S63, and S66 may be executed, and post-determination processing may only be executed for either process S61 or S62, or either process S64 or S65. Furthermore, in the detailed process of process S44, only one of the determinations in process S70 or process S72 may be executed. By omitting such processes, the computational processing required for prohibited transformation list generation can be reduced.

<频率变换决定部105><Frequency Conversion Determination Unit 105>

频率变换决定部105利用被输入的变换制约,决定在扩展MB内的各分区中应用的频率变换,并将该信息作为变换选择标志输出。决定在特定的分区p中应用的频率变换的过程如下所述。The frequency transformation determination unit 105 determines the frequency transformation to be applied to each partition within the extended MB using the input transformation constraints, and outputs this information as a transformation selection flag. The process of determining the frequency transformation to be applied to a specific partition p is as follows.

(过程S120)从变换制约提取与分区p对应的禁止变换列表Lp。(Process S120) Extract the prohibited transformation list Lp corresponding to the partition p from the transformation constraints.

(过程S121)取得变换预置集与禁止变换列表Lp的差集来作为变换候补列表Cp。(Process S121) A difference set between the preset conversion set and the prohibited conversion list Lp is obtained as a candidate conversion list Cp.

(Process S122) If the transform candidate list Cp is empty, the frequency transform with the smallest transform size among the square-size frequency transforms in the transform preset set is added to Cp. This step is needed so that a frequency transform to apply still exists when the prohibited transform list coincides with the transform preset set. It may be omitted if the prohibited transform list is always generated so that it never coincides with the transform preset set.

(过程S123)计算应用了变换候补列表Cp中包含的各频率变换时的率失真成本,将使率失真成本最小的频率变换作为在分区p中要应用的频率变换。(Process S123) The rate-distortion cost when each frequency transform included in the transform candidate list Cp is applied is calculated, and the frequency transform that minimizes the rate-distortion cost is set as the frequency transform to be applied to the partition p.
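Processes S120 to S123 amount to a set difference followed by a minimum-cost search. The sketch below illustrates this under stated assumptions: transforms are written as `(width, height)` pairs, and `rd_cost` is a hypothetical callable standing in for the rate-distortion cost computation, which the description does not detail.

```python
# Transform preset set from the description, as (width, height) pairs.
PRESET = [(4, 4), (8, 8), (16, 16), (16, 1), (1, 16),
          (8, 1), (1, 8), (16, 8), (8, 16)]


def candidate_list(preset, prohibited):
    """S121-S122: preset minus the prohibited list, with a fallback if empty."""
    candidates = [t for t in preset if t not in prohibited]
    if not candidates:
        # S122: fall back to the smallest square transform in the preset
        # (assumes the preset contains at least one square transform).
        squares = [t for t in preset if t[0] == t[1]]
        candidates = [min(squares, key=lambda t: t[0])]
    return candidates


def choose_transform(preset, prohibited, rd_cost):
    """S123: pick the candidate with the lowest rate-distortion cost.

    rd_cost is a hypothetical callable mapping a transform to its cost.
    """
    return min(candidate_list(preset, prohibited), key=rd_cost)


# Prohibited list given in the description for partition f (16x16, layer L2).
prohibited_f = {(4, 4), (8, 1), (1, 8), (16, 1), (1, 16), (16, 8), (8, 16)}
```

With `prohibited_f`, only the 8×8 and 16×16 transforms remain as candidates, so the cost has to be evaluated for two transforms instead of nine.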

<可变长编码部108><Variable Length Coding Unit 108>

可变长编码部108根据被输入的变换系数、预测参数、变换制约和变换选择标志,生成并输出与扩展MB中的变换系数、预测参数和变换选择标志对应的编码数据。The variable-length coding unit 108 generates and outputs coded data corresponding to the transform coefficients, prediction parameters, transform constraints, and transform selection flag in the extended MB based on the input transform coefficients, prediction parameters, transform constraints, and transform selection flag.

利用现有的方法对变换系数和预测参数进行可变长编码后输出。利用变换制约对变换选择标志进行可变长编码后输出。以下,参照图10的流程图来说明变换选择标志的可变长编码过程。The transform coefficients and prediction parameters are variable-length coded using existing methods and then output. The transform selection flag is variable-length coded using transform constraints and then output. The variable-length coding process of the transform selection flag is described below with reference to the flowchart in FIG10 .

(过程S80)若扩展MB中的阶层L0的分割方法是四分割以外,则执行过程S81的处理,否则执行过程S82~过程S92的处理。(Process S80) If the division method of the layer L0 in the extended MB is other than four divisions, the process of process S81 is executed; otherwise, the processes of process S82 to process S92 are executed.

(过程S81)对表示应用于阶层L0的处理单位(64×64像素)内的各分区的频率变换的信息进行可变长编码,之后结束处理。(Process S81) Variable-length coding is performed on information indicating the frequency transformation applied to each partition within the processing unit (64×64 pixels) of the hierarchy L0, and the process is terminated.

(过程S82)分别对将阶层L0的处理单位进行四分割而得到的阶层L1的各处理单位(32×32像素),执行以下的过程S83~过程S92的处理。(Process S82) The following processes of process S83 to process S92 are executed for each processing unit (32×32 pixels) of the layer L1 obtained by dividing the processing unit of the layer L0 into four.

(过程S83)若当前的处理单位(32×32像素)中的阶层L1的分割方法是四分割以外,则进入过程S84,否则进入过程S85。(Step S83) If the division method of the layer L1 in the current processing unit (32×32 pixels) is other than four-division, the process proceeds to Step S84; otherwise, the process proceeds to Step S85.

(过程S84)对表示应用于当前的处理单位(32×32像素)内的各分区的频率变换的信息进行可变长编码,之后进入过程S92。(Step S84) Variable-length coding is performed on information indicating the frequency transformation applied to each partition within the current processing unit (32×32 pixels), and then the process proceeds to Step S92.

(过程S85)分别对将阶层L1的处理单位(32×32像素)进行四分割而得到的阶层L2的各处理单位(16×16像素),应用以下的过程S86~过程S91的处理。(Process S85) The following processes of process S86 to process S91 are applied to each of the processing units (16×16 pixels) of the layer L2 obtained by dividing the processing unit (32×32 pixels) of the layer L1 into four.

(过程S86)若当前的处理单位(16×16像素)中的阶层L2的分割方法是四分割以外,则进入过程S87,否则进入过程S88。(Step S86) If the division method of the layer L2 in the current processing unit (16×16 pixels) is other than four-division, the process proceeds to step S87; otherwise, the process proceeds to step S88.

(过程S87)对表示应用于当前的处理单位(16×16像素)内的各分区的频率变换的信息进行可变长编码,之后进入过程S91。(Step S87) Variable-length coding is performed on information indicating the frequency transformation applied to each partition within the current processing unit (16×16 pixels), and then the process proceeds to Step S91.

(过程S88)分别对将阶层L2的处理单位进行四分割而得到的阶层L3的各处理单位(8×8像素),执行以下的过程S89~过程S90的处理。(Process S88) The following processes of process S89 to process S90 are executed for each processing unit (8×8 pixels) of the layer L3 obtained by dividing the processing unit of the layer L2 into four.

(过程S89)对表示应用于当前的处理单位(8×8像素)内的各分区的频率变换的信息进行可变长编码,之后进入过程S90。(Step S89) Variable-length coding is performed on information indicating the frequency transformation applied to each partition within the current processing unit (8×8 pixels), and then the process proceeds to Step S90.

(过程S90)若全部处理单位(8×8像素)的处理结束,则进入过程S91。否则设定下一处理单位(8×8像素)并进入过程S89。(Step S90) If the processing of all processing units (8×8 pixels) is completed, the process proceeds to Step S91. Otherwise, the next processing unit (8×8 pixels) is set and the process proceeds to Step S89.

(过程S91)若全部处理单位(16×16像素)的处理结束,则进入过程S92。否则,设定下一处理单位(16×16像素)并进入过程S86。(Step S91) If the processing of all processing units (16×16 pixels) is completed, the process proceeds to step S92. Otherwise, the next processing unit (16×16 pixels) is set and the process proceeds to step S86.

(过程S92)若全部处理单位(32×32像素)的处理结束,则结束处理。否则设定下一处理单位(32×32像素)并进入过程S83。(Step S92) If the processing of all processing units (32×32 pixels) is completed, the processing is terminated. Otherwise, the next processing unit (32×32 pixels) is set and the process proceeds to Step S83.
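The nested loops of processes S80 to S92 can be read as a depth-first traversal of the partition hierarchy. The following sketch shows one way to express that traversal recursively. The dictionary layout (`"split"`, `"children"`), the split labels, and the order in which the four sub-units are visited are assumptions made for illustration.

```python
def flag_coding_units(unit, size=64, min_size=8, path=()):
    """Yield (path, size) for every processing unit whose partitions
    have their transform selection information coded, in coding order."""
    if unit["split"] != "quad" or size == min_size:
        # S81 / S84 / S87 / S89: code the flags of the partitions in this unit.
        yield path, size
        return
    # S82 / S85 / S88: visit the four sub-units of the next layer in turn.
    for i, child in enumerate(unit["children"]):
        yield from flag_coding_units(child, size // 2, min_size, path + (i,))


# Partition structure resembling FIG. 9 (children listed upper-left,
# upper-right, lower-left, lower-right; this order is an assumption).
leaf = {"split": "none"}
l2_quad = {"split": "quad", "children": [leaf, leaf, leaf, leaf]}
l1_quad = {"split": "quad",
           "children": [leaf, {"split": "hor"}, {"split": "ver"}, l2_quad]}
extended_mb = {"split": "quad",
               "children": [leaf, {"split": "hor"}, {"split": "ver"}, l1_quad]}

order = list(flag_coding_units(extended_mb))
```

For this structure the traversal visits three 32×32 units, then three 16×16 units, then four 8×8 units.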

利用以下的过程来执行与特定的分区p对应的变换选择标志的可变长编码。The variable-length coding of the transform selection flag corresponding to a specific partition p is performed using the following procedure.

(过程S130)从变换制约提取与分区p对应的禁止变换列表Lp。(Process S130) Extract the prohibited transformation list Lp corresponding to the partition p from the transformation constraints.

(过程S131)取得变换预置集与禁止变换列表Lp的差集来作为变换候补列表Cp。(Process S131) A difference set between the preset conversion set and the prohibited conversion list Lp is obtained as a candidate conversion list Cp.

(Process S132) If the transform candidate list Cp is empty, the frequency transform with the smallest transform size among the square-size frequency transforms in the transform preset set is added to Cp. The frequency transform added in this step is not limited to that one; it may be any other frequency transform in the transform preset set whose transform size is smaller than partition p. It must, however, be the same frequency transform as the one used in process S122 of the frequency transform determination unit 105.

(过程S133)在变换候补列表Cp中包含的频率变换的数量仅是1个时,结束可变长编码处理。在该情况下,即使不将表示应用于分区p的频率变换的信息包含在编码数据中,也能够唯一地确定在数据的解码时要应用哪个频率变换,所以不会产生问题。(Process S133) If the number of frequency transforms included in the transform candidate list Cp is only one, the variable-length coding process is terminated. In this case, even if information indicating the frequency transform applied to partition p is not included in the encoded data, it is possible to uniquely determine which frequency transform to apply when decoding the data, so no problem occurs.

(过程S134)按照给定顺序重排变换候补列表Cp中包含的频率变换,并与从0开始每次增加1的索引建立对应。(Process S134) The frequency transforms included in the transform candidate list Cp are rearranged in a given order and associated with indexes that increase by 1 starting from 0.

(Process S135) The index associated with the frequency transform applied to partition p is variable-length coded. One applicable method is as follows: let s be the number of elements in the frequency transform candidate list and t be the smallest integer such that 2 to the power t is greater than or equal to s; the bit string obtained by binarizing the index value with t bits is used as the encoded data.

The fewer elements the frequency transform candidate list has, the smaller the code amount needed to encode the index. That is, setting prohibited transforms for each partition reduces the code amount required to encode the transform selection flag. A shorter candidate list also reduces the computation the encoding process needs to select the frequency transform to apply.
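The fixed-length binarization mentioned in process S135, together with the early exit of process S133, can be sketched as below. The function names and the use of a string of '0'/'1' characters as the bit string are assumptions for illustration only.

```python
def index_bit_width(s):
    """Smallest t such that 2**t >= s, for s >= 1."""
    return (s - 1).bit_length()


def encode_transform_flag(applied, candidates):
    """Return the bit string for the transform applied to a partition.

    `candidates` is the transform candidate list Cp, already ordered
    as in process S134, so a transform's position is its index.
    """
    if len(candidates) == 1:
        # S133: a single candidate needs no bits at all.
        return ""
    t = index_bit_width(len(candidates))
    index = candidates.index(applied)
    # S135: binarize the index with exactly t bits.
    return format(index, "0{}b".format(t))
```

With nine candidates the index costs 4 bits, with two candidates 1 bit, and with one candidate nothing is written, which is where the saving from the prohibited transform list comes from.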

另外,对于上述过程S134中的给定顺序,例如可以使用如下那样的顺序:对大变换尺寸的频率变换附加比小变换尺寸的频率变换小的索引、在变换尺寸为正方形时附加比横长方形时小的索引、在变换尺寸为横长方形时附加比纵长方形时小的索引。在该情况下,容易按照16×16DCT、16×8DCT、8×16DCT、8×8DCT、4×4DCT、16×1DCT、1×16DCT、8×1DCT、1×8DCT的顺序与小的索引建立对应。Furthermore, the order specified in step S134 can be, for example, as follows: a frequency transform with a large transform size is assigned a smaller index than a frequency transform with a small transform size; a square transform is assigned a smaller index than a horizontally rectangular transform; and a horizontally rectangular transform is assigned a smaller index than a vertically rectangular transform. In this case, it is easy to associate smaller indices in the order of 16×16 DCT, 16×8 DCT, 8×16 DCT, 8×8 DCT, 4×4 DCT, 16×1 DCT, 1×16 DCT, 8×1 DCT, and 1×8 DCT.
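The ordering rule described here can be written as a sort key. In the sketch below, "transform size" is taken to mean the area width × height; this interpretation is an assumption, chosen because it reproduces the order listed in the text. The function name is likewise hypothetical.

```python
def default_order(candidates):
    """Order transforms (width, height) so that earlier ones get smaller indices."""
    def key(wh):
        w, h = wh
        # 0: square, 1: horizontally long, 2: vertically long.
        shape = 0 if w == h else (1 if w > h else 2)
        # Larger area first, then the shape preference.
        return (-(w * h), shape)
    return sorted(candidates, key=key)


preset = [(4, 4), (8, 8), (16, 16), (16, 1), (1, 16),
          (8, 1), (1, 8), (16, 8), (8, 16)]
ordered = default_order(preset)
```

Applied to the nine transform sizes of the preset, this yields 16×16, 16×8, 8×16, 8×8, 4×4, 16×1, 1×16, 8×1, 1×8.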

As another example, the given order in process S134 may be the order of decreasing selection frequency of the frequency transforms. Specifically, after encoding of the input moving picture starts, the encoder counts how many times each transform in the transform preset set has been selected as the transform of a partition, and builds an order in which frequency transforms selected more often receive smaller indices. Because the occurrence frequency of the indices then becomes skewed as well, the code amount of the variable-length coding of the index in process S135 decreases. The selection counts may be initialized to a predetermined value such as zero at an appropriate timing, for example when encoding of a new frame starts or when encoding of a slice (a set of a given number of extended MBs) starts. Conditional selection counts, for example the selection count of each frequency transform per partition size, may also be counted and used.
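A selection-frequency-based order could be maintained with a simple counter, as in the following sketch. The class name, the method names, and the tie-breaking rule (position in the preset) are assumptions; the description only requires that more frequently selected transforms receive smaller indices. A decoder would need to update the same counts in the same way to stay synchronized.

```python
from collections import Counter


class AdaptiveOrder:
    """Tracks how often each transform was selected and orders candidates."""

    def __init__(self, preset):
        self.preset = list(preset)
        self.counts = Counter()

    def reset(self):
        # Called, for example, at the start of a frame or slice.
        self.counts.clear()

    def record(self, transform):
        # Called once each time a transform is selected for a partition.
        self.counts[transform] += 1

    def order(self, candidates):
        # Most frequently selected first; ties broken by preset position
        # (the tie-break is an assumption, not stated in the description).
        return sorted(candidates,
                      key=lambda t: (-self.counts[t], self.preset.index(t)))
```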

此外,对于上述过程S135中的索引的可变长编码,还可以采用其他方法。例如可以使用由H.264/AVC规定的各种VLC、CABAC等。Furthermore, other methods may be used for variable-length coding of the index in the above-mentioned step S135, such as various VLCs and CABAC specified in H.264/AVC.

Alternatively, instead of variable-length coding the index directly, a flag indicating whether the index matches an index prediction value may be encoded, and the index itself variable-length coded only when the flag indicates a mismatch. The frequency transform used by the partition being processed is estimated from information of already encoded extended MBs (local decoded image, partition structure, motion vectors, and so on), and the index corresponding to that frequency transform can be used as the index prediction value. It is particularly preferable to take the spatial correlation of frequency transforms into account and derive the index prediction value from the frequency transforms applied to partitions near the partition being processed. Specifically, the following method is preferred: derive the indices of the frequency transforms applied to the partitions located to the left of, above, and to the upper right of the partition being processed; if two or more of these indices match, use that value as the index prediction value; otherwise, use the smallest of these indices.
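The neighbor-based index prediction rule can be sketched as follows. The function name is hypothetical, and the sketch assumes that all three neighboring indices are available; how unavailable neighbors (for example at picture edges) are handled is not stated in the description.

```python
from collections import Counter


def predict_index(left, above, above_right):
    """Predict the transform index from three neighbouring partitions."""
    neighbours = [left, above, above_right]
    value, count = Counter(neighbours).most_common(1)[0]
    # If at least two neighbours agree, use their common index;
    # otherwise use the smallest of the three indices.
    return value if count >= 2 else min(neighbours)
```

For instance, neighbors with indices 2, 2, and 5 give a prediction of 2, while neighbors with indices 3, 1, and 4 give a prediction of 1.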

The variable-length coding procedure above describes coding the transform selection flag of every partition. However, on the condition that the same frequency transform is applied to all partitions belonging to the same processing unit of a specific layer Lx, a single transform selection flag shared by the partitions in the processing unit may be variable-length coded for each processing unit of layer Lx. This reduces the freedom in selecting frequency transforms, but the transform selection flag then needs to be coded only once per processing unit of layer Lx rather than once per partition, so the code amount required for the transform selection flags decreases. Conversely, a partition may be further divided into units no smaller than the largest transform size among the frequency transforms in the transform candidate list, and a transform selection flag coded for each such unit.

<预测残差重建部109><Prediction Residual Reconstruction Unit 109>

The prediction residual reconstruction unit 109 applies an inverse frequency transform to the input transform coefficients according to the input transform selection flag, thereby reconstructing and outputting the prediction residual. If the transform coefficients have been quantized, inverse quantization is applied to them before the inverse frequency transform.

<局部解码图像生成部110><Local Decoded Image Generator 110>

局部解码图像生成部110根据被输入的预测图像和预测残差,生成并输出局部解码图像。局部解码图像的各像素值为预测图像与预测残差对应的像素间的像素值的和。另外,为了降低在块边界发生的块失真,降低量化误差,可以对局部解码图像应用滤波器。The local decoded image generation unit 110 generates and outputs a local decoded image based on the input predicted image and prediction residual. Each pixel value in the local decoded image is the sum of the pixel values corresponding to the predicted image and the prediction residual. Furthermore, a filter may be applied to the local decoded image to reduce block distortion and quantization error at block boundaries.
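The pixel-wise sum performed by the local decoded image generation unit 110 can be illustrated with a minimal sketch. Images are represented here as nested lists of integers, which is an assumption for illustration; clipping to a valid pixel range and the optional filter are not shown because the description does not specify them.

```python
def reconstruct(predicted, residual):
    """Add the prediction residual to the predicted image, pixel by pixel."""
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted, residual)]


predicted = [[100, 102], [98, 97]]
residual = [[-2, 1], [0, 3]]
local_decoded = reconstruct(predicted, residual)
```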

<运动图像编码装置10的动作><Operation of Moving Image Coding Device 10>

接着,对运动图像编码装置10的动作进行说明。Next, the operation of the moving picture encoding device 10 will be described.

(过程S100)从外部输入到运动图像编码装置10的输入运动图像,以扩展MB为单位依次输入到预测参数决定部102以及预测残差生成部106,对各扩展MB,依次执行以后的S101~S109的处理。(Process S100) An input moving picture input from the outside to the moving picture encoding device 10 is sequentially input to the prediction parameter determination unit 102 and the prediction residual generation unit 106 in units of extended MBs, and the subsequent processes of S101 to S109 are sequentially performed on each extended MB.

(过程S101)在预测参数决定部102中,针对处理对象扩展MB,根据被输入的输入运动图像来决定预测参数,并输出到预测图像生成部103以及可变长编码部108。(Process S101 ) The prediction parameter determination unit 102 determines prediction parameters for the extended MB to be processed based on the input moving picture, and outputs the prediction parameters to the prediction picture generation unit 103 and the variable-length coding unit 108 .

(过程S102)在预测图像生成部103中,根据被输入的预测参数以及记录在帧存储器101中的局部解码图像,生成与输入运动图像中的处理对象扩展MB的区域近似的预测图像,并输出到预测残差生成部106以及局部解码图像生成部110。(Process S102) In the prediction image generation unit 103, a prediction image that is approximate to the area of the processing object extended MB in the input motion image is generated based on the input prediction parameters and the local decoded image recorded in the frame memory 101, and is output to the prediction residual generation unit 106 and the local decoded image generation unit 110.

(过程S103)在预测残差生成部106中,根据被输入的输入运动图像和预测图像,生成与处理对象扩展MB对应的预测残差,并输出到频率变换决定部105以及变换系数生成部107。(Process S103 ) The prediction residual generation unit 106 generates a prediction residual corresponding to the extended MB to be processed based on the input moving picture and the predicted picture, and outputs it to the frequency transformation determination unit 105 and the transform coefficient generation unit 107 .

(过程S104)在变换制约导出部104中,根据被输入的预测参数,导出关于处理对象扩展MB的各分区中的频率变换的制约来作为变换制约,并输出到频率变换决定部105以及可变长编码部108。(Process S104) The transform constraint derivation unit 104 derives constraints on frequency transform in each partition of the extended MB to be processed as transform constraints based on the input prediction parameters, and outputs them to the frequency transform determination unit 105 and the variable length coding unit 108.

(过程S105)在频率变换决定部105中,根据被输入的变换制约和预测残差,决定应用于处理对象扩展MB的各分区的频率变换,并作为变换选择标志输出到变换系数生成部107以及可变长编码部108以及预测残差重建部109。(Process S105) In the frequency transform determination unit 105, the frequency transform applied to each partition of the processing target extended MB is determined based on the input transform constraint and prediction residual, and is output as a transform selection flag to the transform coefficient generation unit 107, the variable length coding unit 108, and the prediction residual reconstruction unit 109.

(过程S106)在变换系数生成部107中,将由被输入的变换选择标志规定的频率变换应用到被输入的预测残差,生成与处理对象扩展MB对应的变换系数,并输出到可变长编码部108以及预测残差重建部109。(Process S106) In the transform coefficient generation unit 107, the frequency transform specified by the input transform selection flag is applied to the input prediction residual to generate the transform coefficient corresponding to the processing target extended MB, and output it to the variable length coding unit 108 and the prediction residual reconstruction unit 109.

(过程S107)在预测残差重建部109中,将与由被输入的变换选择标志规定的频率变换对应的逆频率变换应用到被输入的变换系数,重建与处理对象扩展MB对应的预测残差,并输出到局部解码图像生成部110。(Process S107) In the prediction residual reconstruction unit 109, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag is applied to the input transform coefficient, and the prediction residual corresponding to the processing target extended MB is reconstructed and output to the local decoded image generation unit 110.

(过程S108)在局部解码图像生成部110中,根据被输入的预测残差和预测图像来生成局部解码图像,并输出到帧存储器101进行记录。(Process S108) The local decoded image generation unit 110 generates a local decoded image based on the input prediction residual and the predicted image, and outputs the local decoded image to the frame memory 101 for storage.

(过程S109)在可变长编码部108中,利用被输入的变换制约,对被输入的变换系数、预测参数以及变换选择标志进行可变长编码,并将其结果作为编码数据输出到外部。(Process S109) In the variable-length coding unit 108, variable-length coding is performed on the input transform coefficients, prediction parameters, and transform selection flag using the input transform constraints, and the result is output to the outside as coded data.

通过上述过程,在运动图像编码装置10中,可以对被输入的输入运动图像进行编码从而生成编码数据并输出到外部。Through the above-described process, the moving picture encoding device 10 can encode the input moving picture to generate encoded data and output it to the outside.

<运动图像解码装置20的构成><Configuration of Moving Image Decoding Device 20>

下面,说明对由运动图像编码装置10进行了编码的编码数据进行解码从而生成解码运动图像的运动图像解码装置20。Next, a description will be given of a moving picture decoding device 20 that decodes the coded data coded by the moving picture coding device 10 to generate a decoded moving picture.

FIG. 11 is a block diagram showing the configuration of the moving picture decoding device 20. The moving picture decoding device 20 includes a frame memory 101, a predicted image generation unit 103, a transform constraint derivation unit 104, a prediction residual reconstruction unit 109, a local decoded image generation unit 110, and a variable-length code decoding unit 201.

可变长符号解码部201根据被输入的编码数据和变换制约,对预测参数以及变换选择标志以及变换系数进行解码并输出。具体而言,首先,从编码数据对预测参数进行解码并输出。接着,利用变换制约,从编码数据对变换选择标志进行解码并输出。最后,利用变换选择标志,从编码数据对变换系数进行解码并输出。The variable-length code decoding unit 201 decodes and outputs prediction parameters, a transform selection flag, and transform coefficients based on the input coded data and transform constraints. Specifically, the prediction parameters are first decoded from the coded data and output. Next, the transform selection flag is decoded from the coded data using the transform constraints and output. Finally, the transform coefficients are decoded from the coded data using the transform selection flag and output.
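Decoding the transform selection flag with the help of the transform constraints mirrors the encoder side: the decoder rebuilds the same candidate list and reads only as many bits as that list requires. The sketch below assumes the fixed-length binarization given as an example for process S135, a bit string of '0'/'1' characters, and a hypothetical function name. It does not guard against an out-of-range index in a corrupted stream.

```python
def decode_transform_flag(bits, pos, candidates):
    """Decode one transform selection flag.

    bits: string of '0'/'1' characters; pos: current read position;
    candidates: the transform candidate list Cp, ordered as on the encoder.
    Returns (transform, new_pos).
    """
    if len(candidates) == 1:
        # Nothing was written for a single candidate; infer it directly.
        return candidates[0], pos
    # Smallest t such that 2**t >= number of candidates.
    t = (len(candidates) - 1).bit_length()
    index = int(bits[pos:pos + t], 2)
    return candidates[index], pos + t
```

Because both sides derive the candidate list from the prediction parameters alone, no extra side information is needed to tell the decoder how many bits to read.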

<运动图像解码装置20的动作><Operation of Video Decoding Device 20>

接着,对运动图像解码装置20的动作进行说明。Next, the operation of the moving picture decoding device 20 will be described.

(Process S110) The coded data input from outside to the moving picture decoding device 20 is input sequentially, in units of extended MBs, to the variable-length code decoding unit 201, and the following Processes S111 to S117 are executed in order on the coded data corresponding to each extended MB.

(Process S111) The variable-length code decoding unit 201 decodes the prediction parameters corresponding to the processing target extended MB from the input coded data, and outputs them to the predicted image generation unit 103 and the transform constraint derivation unit 104.

(Process S112) The transform constraint derivation unit 104 derives, based on the input prediction parameters, the constraints on the frequency transform in each partition of the processing target extended MB as transform constraints, and outputs them to the variable-length code decoding unit 201.

(Process S113) The variable-length code decoding unit 201 decodes the transform selection flag corresponding to the processing target extended MB based on the input coded data and the transform constraints, and outputs it to the prediction residual reconstruction unit 109.

(Process S114) The variable-length code decoding unit 201 decodes the transform coefficients corresponding to the processing target extended MB based on the input coded data and the transform selection flag decoded in Process S113, and outputs them to the prediction residual reconstruction unit 109.

(Process S115) The predicted image generation unit 103 generates a predicted image corresponding to the processing target extended MB based on the input prediction parameters and the local decoded image recorded in the frame memory 101, and outputs it to the local decoded image generation unit 110.

(Process S116) The prediction residual reconstruction unit 109 applies, to the input transform coefficients, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag, reconstructs the prediction residual corresponding to the processing target extended MB, and outputs it to the local decoded image generation unit 110.

(Process S117) The local decoded image generation unit 110 generates a local decoded image from the input prediction residual and predicted image, outputs it to the frame memory 101 for recording, and also outputs it to the outside as the region of the decoded moving picture corresponding to the processing target block.

As described above, the moving picture decoding device 20 can generate a decoded moving picture from the coded data generated by the moving picture encoding device 10.

<Supplementary Note 1: Use of Information Other Than Partition Size and Layer>

In the above description of the moving picture encoding device 10 and the moving picture decoding device 20, the prohibited transform list for each partition within an extended MB is generated based only on the partition size and the layer to which the partition belongs. However, other information that can be reproduced at decoding time from the information contained in the coded data may also be used. For example, the motion vectors and reference picture indices included in the prediction parameters may be used to derive the prohibited transform list.

The following describes the procedure for adding frequency transforms to the prohibited transform list of a specific partition p using motion vectors and reference picture indices. Let mvp be the motion vector of partition p and refp its reference picture index. Among the partitions adjacent to the upper side of partition p, let mvu and refu be the motion vector and reference picture index of the leftmost one (partition u). Among the partitions adjacent to the left side of partition p, let mvl and refl be the motion vector and reference picture index of the topmost one (partition l).

(Process S140) If mvp, mvu, and mvl are all equal and refp, refu, and refl are all equal, proceed to Process S141. Otherwise, end the procedure.

(Process S141) If the frequency transform list contains two or more frequency transforms whose transform sizes are similar to one another, then for each group of frequency transforms with similar transform sizes, add the frequency transform with the smallest transform size in the group to Lp, and end the procedure.
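
Processes S140 and S141 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: it assumes that "similar" transform sizes are those with the same width-to-height ratio, that the frequency transform list is a collection of (width, height) pairs, and that Lp is a plain list. The function name and the example sizes are hypothetical.

```python
from fractions import Fraction


def add_prohibited_by_motion(lp, transforms, mvp, mvu, mvl, refp, refu, refl):
    """Return a copy of Lp extended according to Processes S140 and S141."""
    lp = list(lp)
    # Process S140: require identical motion vectors and reference indices.
    if not (mvp == mvu == mvl and refp == refu == refl):
        return lp
    # Process S141: group transforms of similar shape
    # (assumption: "similar" means equal aspect ratio).
    groups = {}
    for width, height in sorted(transforms):
        groups.setdefault(Fraction(width, height), []).append((width, height))
    for group in groups.values():
        if len(group) >= 2:
            smallest = min(group, key=lambda size: size[0] * size[1])
            if smallest not in lp:
                lp.append(smallest)
    return lp


# Hypothetical transform list: three square sizes plus two rectangular ones.
transforms = [(4, 4), (8, 8), (16, 16), (16, 8), (8, 16)]
same = add_prohibited_by_motion([], transforms, (1, 0), (1, 0), (1, 0), 0, 0, 0)
diff = add_prohibited_by_motion([], transforms, (1, 0), (2, 0), (1, 0), 0, 0, 0)
```

With these inputs, only the square group has two or more members, so `same` contains the smallest square size, while `diff` stays empty because the motion vectors do not match.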

Matching motion vectors between adjacent blocks mean that the spatial correlation of motion vectors is high in the local region around the extended MB being coded. When the spatial correlation of motion vectors is high, the spatial correlation of pixel values also tends to be high. Therefore, even if, among frequency transforms with similar transform sizes, the one with the smaller transform size is prohibited, the resulting increase in the code amount of the coded data is slight.

In the above description, the motion vectors and reference picture indices used to derive the prohibited transform list are those of the partitions adjacent to partition p, but other motion vectors may also be used. For example, motion vectors within the extended MBs adjacent to the extended MB to which partition p belongs (the processing target extended MB) may be used. Specifically, the motion vector of the upper-right partition in the extended MB adjacent to the left side of the processing target extended MB is used as mvl, and the motion vector of the lower-left partition in the extended MB adjacent to the upper side of the processing target extended MB is used as mvu. In this case, all partitions in the extended MB use the same mvl and mvu, so Processes S140 and S141 can be executed in parallel on a per-partition basis.

<Supplementary Note 2: Timing of Prohibited Transform List Generation>

In the above description of the moving picture encoding device 10 and the moving picture decoding device 20, the transform constraint derivation unit 104 generates the prohibited transform list on the fly for each partition of the extended MB. However, when frequency transforms are added to the prohibited transform list based only on the partition size and the layer to which the partition belongs, the prohibited transform lists can be generated in advance at a given timing. In that case, the transform constraint derivation unit 104 must associate the prohibited transform list generated in advance for each type of partition with each partition in the extended MB. The given timing may be immediately after the start of encoding of the input moving picture or of decoding of the coded data, or immediately after the start of encoding or decoding of a given coding unit such as a sequence, frame, or slice. Because the prohibited transform list generation process is executed fewer times, the processing load of encoding and decoding can be reduced.
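
When the list depends only on the partition size and layer, the advance generation described above amounts to building a lookup table once and consulting it per partition. The sketch below assumes a caller-supplied derivation function (`derive`, hypothetical) and represents a partition type as a (width, height, layer) tuple; neither is prescribed by the text, and the stub rule exists only to make the example runnable.

```python
def build_prohibited_table(partition_types, derive):
    """Derive the prohibited transform list once per partition type."""
    return {ptype: tuple(derive(*ptype)) for ptype in partition_types}


calls = []


def stub_derive(width, height, layer):
    # Placeholder rule for illustration only; records each invocation.
    calls.append((width, height, layer))
    return [(4, 4)] if layer == 0 else []


# Built once, e.g. at the start of a sequence, frame, or slice.
table = build_prohibited_table([(64, 64, 0), (32, 32, 1)], stub_derive)

# Associating a list with each partition is then a dictionary lookup,
# not a repeated derivation.
lists = [table[p] for p in [(64, 64, 0), (32, 32, 1), (64, 64, 0)]]
```

The derivation runs once per partition type regardless of how many partitions of that type appear afterwards, which is the source of the reduced processing load.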

Conversely, when motion vectors or reference picture indices are used to add frequency transforms to the prohibited transform list, the generation process must be executed on the fly for each extended MB, as described above for the moving picture encoding device 10 and the moving picture decoding device 20. In this case, the larger number of generation runs increases the processing load of encoding and decoding. However, compared with the case where the generation process is not executed for each MB, more of the information derivable from the coded data can be used, so a prohibited transform list better suited to the local characteristics of the moving picture can be generated.

(Embodiment 2)

Next, a moving picture encoding device 11 and a moving picture decoding device 21, which are another embodiment of the moving picture encoding device and the moving picture decoding device of the present invention, will be described with reference to FIGS. 12 to 14. In the description of the drawings, the same elements are denoted by the same reference numerals, and their description is omitted.

The moving picture encoding device 11 and the moving picture decoding device 21 of this embodiment are characterized in that the transform constraint derivation unit 104 of the moving picture encoding device 10 and the moving picture decoding device 20 is replaced by a transform candidate derivation unit 111, so that the transform candidate list is derived directly without generating a prohibited transform list.

The transform constraint derivation unit 104 and the transform candidate derivation unit 111 are collectively referred to as a transform control derivation unit.

FIG. 12 is a block diagram showing the configuration of the moving picture encoding device 11. The moving picture encoding device 11 includes a frame memory 101, a prediction parameter determination unit 102, a predicted image generation unit 103, a prediction residual generation unit 106, a transform coefficient generation unit 107, a prediction residual reconstruction unit 109, a local decoded image generation unit 110, a transform candidate derivation unit 111, a frequency transform determination unit 112, and a variable-length coding unit 113.

The transform candidate derivation unit 111 outputs, as a transform candidate list, information on the frequency transforms selectable in each partition within the extended MB, based on the input prediction parameters. That is, it generates the transform candidate list of each partition from the partition shape information of that partition determined by the prediction parameters.

A transform candidate list is associated with each partition in the extended MB and specifies, among the frequency transforms included in the transform preset set, the set of frequency transforms selectable for that partition.

The transform candidate list Cp for a specific partition p is generated by the following procedure. The size of partition p is M×N pixels (M pixels wide, N pixels high), and partition p belongs to layer Lx.

(Process S150) Add to Cp the frequency transforms determined by the size relationship between M and N.

(Process S151) If Cp is empty, add to Cp the frequency transform with the largest transform size among the frequency transforms whose transform size is smaller than the partition size.

A more detailed procedure for Process S150 is described with reference to the flowchart of FIG. 13.

(Process S160) Using a given value Th3 (here, Th3 = 16), set M1 to Min(M, Th3) and N1 to Min(N, Th3). Th3 is preferably set to the side length of the largest square transform size among the frequency transforms in the transform preset set. If the transform preset set contains a frequency transform with transform size M1×N1, add that frequency transform to the transform candidate list Cp. Then proceed to Process S161.

(Process S161) If M is larger than N (partition p is a horizontally long rectangle), proceed to Process S162; otherwise, proceed to Process S163.

(Process S162) If the transform preset set contains a frequency transform with transform size M1×1, add that frequency transform to the transform candidate list Cp, and end the procedure.

(Process S163) If M is smaller than N (partition p is a vertically long rectangle), proceed to Process S164; otherwise, proceed to Process S165.

(Process S164) If the transform preset set contains a frequency transform with transform size 1×N1, add that frequency transform to the transform candidate list Cp, and end the procedure.

(Process S165) Set M2 to M1÷2 and N2 to N1÷2. If the transform preset set contains a frequency transform with transform size M2×N2, add that frequency transform to the transform candidate list Cp, and end the procedure. This process is executed when M and N are equal (partition p is a square).

The size relationship between M and N and the partition size M×N described above constitute the partition shape information.
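
The derivation in Processes S160 to S165, followed by the fallback of Process S151, can be sketched as below. The contents of the transform preset set shown here are an assumption made for illustration, as is the reading of "smaller than the partition size" as "fits inside the partition" with size compared by area; the function and variable names are hypothetical.

```python
# Assumed transform preset set as (width, height) pairs; illustrative only.
PRESET = {(4, 4), (8, 8), (16, 16), (16, 8), (8, 16),
          (16, 1), (1, 16), (8, 1), (1, 8)}
TH3 = 16  # side of the largest square transform in the assumed preset


def derive_candidates(m, n, preset=PRESET, th3=TH3):
    """Build the transform candidate list Cp for an M x N partition."""
    cp = []
    # Process S160: clamp the partition size and try M1 x N1.
    m1, n1 = min(m, th3), min(n, th3)
    if (m1, n1) in preset:
        cp.append((m1, n1))
    if m > n:
        # Processes S161/S162: horizontally long partition, try M1 x 1.
        if (m1, 1) in preset:
            cp.append((m1, 1))
    elif m < n:
        # Processes S163/S164: vertically long partition, try 1 x N1.
        if (1, n1) in preset:
            cp.append((1, n1))
    else:
        # Process S165: square partition, try (M1 / 2) x (N1 / 2).
        m2, n2 = m1 // 2, n1 // 2
        if (m2, n2) in preset:
            cp.append((m2, n2))
    if not cp:
        # Process S151 (assumed reading): largest transform that fits.
        fitting = [t for t in preset if t[0] <= m and t[1] <= n]
        if fitting:
            cp.append(max(fitting, key=lambda t: (t[0] * t[1], t)))
    return cp


square = derive_candidates(64, 64)
wide = derive_candidates(64, 32)
tall = derive_candidates(8, 16)
fallback = derive_candidates(32, 8, preset={(4, 4)})
```

Under these assumptions a 64×64 partition receives the 16×16 and 8×8 transforms, a 64×32 partition receives 16×16 and 16×1, and an 8×16 partition receives 8×16 and 1×16.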

In the above procedure, for a horizontally long (vertically long) rectangular partition, a frequency transform whose transform size is shorter in the vertical (horizontal) direction than the height (width) of the partition is added to the transform candidate list Cp if it exists in the transform preset set. As noted with reference to FIG. 6 in the description of the prohibited transform list derivation procedure in the transform constraint derivation unit 104 of the moving picture encoding device 10, frequency transforms with horizontally long (vertically long) rectangular transform sizes are effective for horizontally long (vertically long) rectangular partitions. In particular, using a frequency transform whose short side is very short relative to its long side reduces the cases in which an object boundary falls within the transform size, and improves the concentration of energy into the low-frequency components of the transform coefficients.

The frequency transform determination unit 112 uses the input transform candidate list to determine the frequency transform to be applied to each partition within the extended MB, and outputs this information as a transform selection flag. Specifically, it calculates the rate-distortion cost of applying each frequency transform included in the transform candidate list Cp, and selects the frequency transform that minimizes the rate-distortion cost as the frequency transform to be applied to partition p.
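
The selection performed by the frequency transform determination unit 112 can be illustrated with the sketch below. The text does not give the cost formula, so the common Lagrangian form J = D + λR is assumed here; the distortion and rate callables, the multiplier value, and the use of the list index as the selected entry are all hypothetical.

```python
def choose_transform(candidates, distortion, rate, lam):
    """Return (index, transform) minimizing the assumed cost D + lam * R."""
    costs = [distortion(t) + lam * rate(t) for t in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, candidates[best]


# Hypothetical per-transform measurements for one partition.
cp = [(16, 16), (8, 8)]
dist = {(16, 16): 100.0, (8, 8): 80.0}
bits = {(16, 16): 10.0, (8, 8): 40.0}

high_lambda = choose_transform(cp, dist.get, bits.get, lam=1.0)
low_lambda = choose_transform(cp, dist.get, bits.get, lam=0.1)
```

A larger multiplier weights the rate more heavily and favors the cheaper 16×16 entry in this example, while a smaller one favors the lower-distortion 8×8 entry.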

The variable-length coding unit 113 generates and outputs coded data corresponding to the transform coefficients, prediction parameters, and transform selection flag of the extended MB, based on the input transform coefficients, prediction parameters, transform candidate list, and transform selection flag.

The variable-length coding procedure for the transform selection flag in each partition within the extended MB is the same as Processes S80 to S92 (FIG. 10) in the variable-length coding unit 108 of the moving picture encoding device 10. As the detailed variable-length coding procedure for the transform selection flag in a specific partition p, Processes S133 to S135 in the variable-length coding unit 108 apply.

Next, the operation of the moving picture encoding device 11 will be described.

(Process S170) The input moving picture supplied from outside to the moving picture encoding device 11 is input sequentially, in units of extended MBs, to the prediction parameter determination unit 102 and the prediction residual generation unit 106, and the following Processes S171 to S179 are executed in order for each extended MB.

(Process S171) The prediction parameter determination unit 102 determines the prediction parameters for the processing target extended MB based on the input moving picture, and outputs them to the predicted image generation unit 103 and the variable-length coding unit 113.

(Process S172) The predicted image generation unit 103 generates, based on the input prediction parameters and the local decoded image recorded in the frame memory 101, a predicted image that approximates the region of the processing target extended MB in the input moving picture, and outputs it to the prediction residual generation unit 106 and the local decoded image generation unit 110.

(Process S173) The prediction residual generation unit 106 generates the prediction residual corresponding to the processing target extended MB from the input moving picture and the predicted image, and outputs it to the frequency transform determination unit 112 and the transform coefficient generation unit 107.

(Process S174) The transform candidate derivation unit 111 derives, based on the input prediction parameters, the transform candidate list for each partition of the processing target extended MB, and outputs it to the frequency transform determination unit 112 and the variable-length coding unit 113.

(Process S175) The frequency transform determination unit 112 determines, based on the input transform candidate list and prediction residual, the frequency transform to be applied to each partition of the processing target extended MB, and outputs it as a transform selection flag to the transform coefficient generation unit 107, the variable-length coding unit 113, and the prediction residual reconstruction unit 109.

(Process S176) The transform coefficient generation unit 107 applies the frequency transform specified by the input transform selection flag to the input prediction residual to generate the transform coefficients corresponding to the processing target extended MB, and outputs them to the variable-length coding unit 113 and the prediction residual reconstruction unit 109.

(Process S177) The prediction residual reconstruction unit 109 applies, to the input transform coefficients, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag, reconstructs the prediction residual corresponding to the processing target extended MB, and outputs it to the local decoded image generation unit 110.

(Process S178) The local decoded image generation unit 110 generates a local decoded image from the input prediction residual and predicted image, and outputs it to the frame memory 101 for recording.

(Process S179) The variable-length coding unit 113 uses the input transform candidate list to variable-length code the input transform coefficients, prediction parameters, and transform selection flag, and outputs the result to the outside as coded data.

Through the above procedure, the moving picture encoding device 11 can encode the input moving picture to generate coded data and output it to the outside.

<Other Examples of Transform Candidate List Generation Methods>

The description of the transform candidate derivation unit 111 above shows one example of a transform candidate list generation method, but the transform candidate list may also be generated by other methods. For example, when the transform preset set includes two frequency transforms DCTa and DCTb with similar transform sizes (where the transform size of DCTa is larger than that of DCTb), it is also effective to add DCTa but not DCTb to the transform candidate lists of partitions in an upper layer, and to add DCTb to the transform candidate lists of partitions in a lower layer. More specifically, when the transform preset set includes the 16×16 DCT and the 8×8 DCT, at least the 16×16 DCT is added, and the 8×8 DCT is not added, to the transform candidate lists of partitions in layer L0, whose processing unit is 64×64 pixels, and at least the 8×8 DCT is added to the transform candidate lists of partitions in layer L1, whose processing unit is 32×32 pixels.

Even if a specific frequency transform DCTb (for example, the 8×8 DCT) cannot be selected for partitions belonging to a specific layer Lx, an increase in the code amount of the coded data can be suppressed as long as DCTb can be selected for partitions belonging to a layer Ly below Lx: in regions where DCTb is effective, the encoder selects a partition belonging to the lower layer Ly, where DCTb is selectable, rather than a partition belonging to the upper layer Lx. In particular, because frequency transforms with large transform sizes are effective for large partitions, it is effective to let partitions belonging to the upper layer Lx select DCTa (for example, the 16×16 DCT), which has the larger transform size, in exchange for prohibiting DCTb there, while letting partitions belonging to the lower layer Ly select DCTb.
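
The layer-dependent variant can be sketched as a small lookup. This is only an illustration of the 16×16 / 8×8 example from the text: the function name and parameters are hypothetical, and because the text says "at least", a real candidate list may contain further transforms that this sketch omits.

```python
def layer_candidates(layer, dct_a=(16, 16), dct_b=(8, 8), upper_layer=0):
    """Minimal candidate list per layer for a similar pair DCTa > DCTb.

    The upper layer (L0, 64x64 processing unit) gets DCTa and not DCTb;
    lower layers (e.g. L1, 32x32 processing unit) get DCTb.
    """
    if layer == upper_layer:
        return [dct_a]
    return [dct_b]


l0 = layer_candidates(0)
l1 = layer_candidates(1)
```

In a region where the 8×8 DCT works better, the encoder would pick an L1 partition and still reach that transform, so dropping it from L0 costs little.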

<Configuration of Moving Picture Decoding Device 21>

Next, the moving picture decoding device 21, which decodes the coded data produced by the moving picture encoding device 11 to generate a decoded moving picture, will be described.

FIG. 14 is a block diagram showing the configuration of the moving picture decoding device 21. The moving picture decoding device 21 includes a frame memory 101, a predicted image generation unit 103, a prediction residual reconstruction unit 109, a local decoded image generation unit 110, a transform candidate derivation unit 111, and a variable-length code decoding unit 202.

The variable-length code decoding unit 202 decodes and outputs the prediction parameters, the transform selection flag, and the transform coefficients based on the input coded data and the transform candidate list. Specifically, it first decodes the prediction parameters from the coded data and outputs them. Next, it uses the transform candidate list to decode the transform selection flag from the coded data and outputs it. Finally, it uses the transform selection flag to decode the transform coefficients from the coded data and outputs them. Decoding the transform selection flag requires knowing how many bits were used to code it, but this does not necessarily require information about the elements contained in the transform candidate list; knowing the number of elements in the list is sufficient. In that case, the signal that is input to the variable-length code decoding unit 202 and used for decoding the transform selection flag may be only the number of elements contained in the transform candidate list.
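
The point that only the number of candidates is needed can be illustrated as follows. The actual code assignment (Processes S133 to S135) is not reproduced in this section, so the sketch assumes a plain fixed-length binary code of ceil(log2(n)) bits, with no bits spent when there is a single candidate; the function names and the bit-list representation are hypothetical.

```python
def flag_bits(num_candidates):
    """Bits needed for a fixed-length flag over num_candidates entries."""
    if num_candidates < 1:
        raise ValueError("candidate list must not be empty")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return (num_candidates - 1).bit_length()


def read_flag(bits, pos, num_candidates):
    """Read the flag from a list of 0/1 values; return (value, new_pos)."""
    width = flag_bits(num_candidates)
    value = 0
    for offset in range(width):
        value = (value << 1) | bits[pos + offset]
    return value, pos + width


stream = [1, 0, 1]
flag, next_pos = read_flag(stream, 0, num_candidates=3)
single, same_pos = read_flag(stream, 0, num_candidates=1)
```

The decoder never inspects which transforms are in the list at this stage; the list contents matter only later, when the decoded index is mapped to a transform.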

<Operation of Moving Picture Decoding Device 21>

Next, the operation of the moving picture decoding device 21 will be described.

(Process S180) The coded data input from outside to the moving picture decoding device 21 is input sequentially, in units of extended MBs, to the variable-length code decoding unit 202, and the following Processes S181 to S187 are executed in order on the coded data corresponding to each extended MB.

(Process S181) The variable-length code decoding unit 202 decodes the prediction parameters corresponding to the processing target extended MB from the input coded data, and outputs them to the predicted image generation unit 103 and the transform candidate derivation unit 111.

(Process S182) The transform candidate derivation unit 111 derives, based on the input prediction parameters, the transform candidate list for each partition of the processing target extended MB, and outputs it to the variable-length code decoding unit 202.

(Process S183) The variable-length code decoding unit 202 decodes the transform selection flag corresponding to the processing target extended MB based on the input coded data and the transform candidate list, and outputs it to the prediction residual reconstruction unit 109.

(Process S184) The variable-length code decoding unit 202 decodes the transform coefficients corresponding to the processing target extended MB based on the input coded data and the transform selection flag decoded in Process S183, and outputs them to the prediction residual reconstruction unit 109.

(Process S185) The predicted image generation unit 103 generates a predicted image corresponding to the processing target extended MB based on the input prediction parameters and the local decoded image recorded in the frame memory 101, and outputs it to the local decoded image generation unit 110.

(Process S186) The prediction residual reconstruction unit 109 applies, to the input transform coefficients, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag, reconstructs the prediction residual corresponding to the processing target extended MB, and outputs it to the local decoded image generation unit 110.

(Process S187) The local decoded image generation unit 110 generates a local decoded image from the input prediction residual and predicted image, outputs it to the frame memory 101 for recording, and also outputs it to the outside as the region of the decoded moving picture corresponding to the processing target block.

<Decoder Summary>

As described above, the moving picture decoding device 21 can generate a decoded moving picture from the coded data generated by the moving picture encoding device 11.

(Embodiment 3)

Next, a moving picture encoding device 30 and a moving picture decoding device 40, which are another embodiment of the moving picture encoding device and the moving picture decoding device of the present invention, will be described with reference to FIGS. 15 and 16. In the description of the drawings, the same elements are denoted by the same reference numerals, and their description is omitted. The partition structures and the transform preset set available in the moving picture encoding device 30 and the moving picture decoding device 40 are the same as those used in the moving picture encoding device 11 and the moving picture decoding device 21.

The moving picture encoding device 30 and the moving picture decoding device 40 of this embodiment differ from the moving picture encoding device 11 and the moving picture decoding device 21 in that they have a function of adaptively changing the method by which the transform candidate derivation unit derives the transform candidate list, in given units larger than an MB such as scenes, frames, or slices, in accordance with the properties of the moving picture.

FIG. 15 is a block diagram showing the configuration of the moving picture encoding device 30. The moving picture encoding device 30 includes a frame memory 101, a prediction parameter determination unit 102, a predicted image generation unit 103, a prediction residual generation unit 106, a transform coefficient generation unit 107, a prediction residual reconstruction unit 109, a local decoded image generation unit 110, a frequency transform determination unit 112, a transform candidate list derivation rule determination unit 301, a transform candidate derivation unit 302, and a variable-length coding unit 303.

变换候补列表导出规则决定部301根据以场景、帧、片等比MB大的给定单位输入的输入运动图像,生成用于规定或者更新变换候补导出部中的变换候补列表导出方法的变换候补列表导出规则。另外,以下,为了简化说明,设按每个帧来生成变换候补列表导出规则。The conversion candidate list derivation rule determination unit 301 generates conversion candidate list derivation rules for defining or updating the conversion candidate list derivation method used by the conversion candidate derivation unit based on an input moving image input in predetermined units larger than MB, such as scenes, frames, or slices. For simplicity of explanation, the conversion candidate list derivation rules are generated for each frame.

(变换候补列表导出规则的定义)(Definition of the conversion candidate list derivation rule)

变换候补列表导出规则,定义为以下所示的基础规则的组合。The conversion candidate list derivation rules are defined as a combination of the following basic rules.

·基础规则1:规定针对给定分区A,向变换候补列表追加变换预置集内的给定频率变换B。另外,以下,利用[许可、分区A、频率变换B]这样的形式来记载基础规则1。例如,[许可、64×64、T16×16]表示对64×64的分区向变换候补列表追加T16×16的频率变换。Basic Rule 1: Specifies that for a given partition A, a given frequency transform B in the preset transform set be added to the transform candidate list. Hereinafter, Basic Rule 1 is described using the format [permit, partition A, frequency transform B]. For example, [permit, 64×64, T16×16] indicates that a T16×16 frequency transform is added to the transform candidate list for a 64×64 partition.

·基础规则2:规定针对给定分区A,禁止将变换预置集内的给定频率变换B包含到变换候补列表。另外,以下,利用[禁止、分区A、频率变换B]这样的形式来记载基础规则2。例如,[禁止、64×64、T4×4]表示对64×64的大小分区禁止T4×4,在变换候补列表中不包含。Basic Rule 2: For a given partition A, a given frequency transform B within the transform preset set is prohibited from being included in the transform candidate list. Below, Basic Rule 2 is described using the format [prohibit, partition A, frequency transform B]. For example, [prohibit, 64×64, T4×4] prohibits T4×4 from being included in the transform candidate list for a 64×64 partition.

·基础规则3:规定对于给定分区A在变换候补列表中包含变换预置集内的给定频率变换B时,用其他频率变换C置换变换候补列表中的频率变换B。另外,以下,利用[置换、分区A、频率变换B、频率变换C]这样的形式来记载基础规则3。例如,[置换、64×32、T4×4、T16×1]表示对于64×32的大小的分区在变换候补列表中包含T4×4时,从变换候补列表去除T4×4,而将T16×1加入到变换候补列表中。Basic Rule 3: When a given frequency transform B in the transform preset set is included in the transform candidate list for a given partition A, frequency transform B in the transform candidate list is replaced with another frequency transform C. Furthermore, Basic Rule 3 is described below using the format [replace, partition A, frequency transform B, frequency transform C]. For example, [replace, 64×32, T4×4, T16×1] indicates that when T4×4 is included in the transform candidate list for a 64×32 partition, T4×4 is removed from the transform candidate list and T16×1 is added to the transform candidate list.

即,在变换候补列表导出规则中含有多个基础规则,各基础规则被分类为上述基础规则1~3的任一种。That is, the conversion candidate list derivation rule includes a plurality of basic rules, and each basic rule is classified into any one of the basic rules 1 to 3 described above.
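The three basic rule forms above can be written down as a small data structure. The following is a minimal sketch, not the patent's own representation: the class name `BasicRule`, the field names, and the choice of (width, height) tuples for partitions and transforms are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a partition or transform size is a (width, height) pair,
# e.g. (64, 32) for a 64×32 partition or (16, 1) for T16×1.
Size = Tuple[int, int]

KINDS = ("permit", "prohibit", "replace")


@dataclass(frozen=True)
class BasicRule:
    """One basic rule: [permit, A, B], [prohibit, A, B] or [replace, A, B, C]."""
    kind: str                           # one of KINDS
    partition: Size                     # partition A
    transform: Size                     # frequency transform B
    replacement: Optional[Size] = None  # frequency transform C, 'replace' only

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown rule kind: {self.kind}")
        # A replacement transform is required for 'replace' and forbidden otherwise.
        if (self.kind == "replace") != (self.replacement is not None):
            raise ValueError("replacement must be given for 'replace' rules only")


# The three examples given in the text.
EXAMPLES = [
    BasicRule("permit", (64, 64), (16, 16)),          # [permit, 64×64, T16×16]
    BasicRule("prohibit", (64, 64), (4, 4)),          # [prohibit, 64×64, T4×4]
    BasicRule("replace", (64, 32), (4, 4), (16, 1)),  # [replace, 64×32, T4×4, T16×1]
]
```

Making the rule immutable and hashable is a design convenience here: a derivation rule is then simply a list or set of `BasicRule` values.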

另外,在变换候补列表导出规则中,除了基础规则之外,还可以包含由基础规则的组合表现的复合规则,或者用复合规则代替基础规则。以下,列举几个复合规则的示例。Furthermore, in addition to basic rules, the conversion candidate list derivation rules may also include compound rules represented by a combination of basic rules, or may be used in place of basic rules. Several examples of compound rules are listed below.

·复合规则1:在属于特定的阶层的分区禁止特定的变换。例如,在L0阶层禁止T8×8以下的大小的变换,这一规则相当于该复合规则1。上述复合规则(R1)可以表现为如下的基础规则的集合。Compound Rule 1: Prohibit specific transforms in partitions belonging to a specific layer. For example, a rule that prohibits transforms of size T8×8 or smaller in layer L0 corresponds to Compound Rule 1. This compound rule (R1) can be expressed as the following set of basic rules.

R1={[禁止、P、T]:(P是属于L0阶层的分区)∧(T是T8×8以下的频率变换)}R1 = {[Prohibit, P, T]: (P is a partition belonging to the L0 layer) ∧ (T is a frequency conversion of T8×8 or less)}

此外,禁止在给定阶层的上位阶层处于类似关系的频率变换中的尺寸小的频率变换,这一规则也相当于该复合规则1,更具体而言,禁止在阶层L1的上位阶层处于类似关系的T16×16、T8×8、T4×4的各变换中的T8×8以及T4×4,这样的规则也相当于该复合规则1。In addition, the rule that prohibits small-sized frequency transformations among frequency transformations that are in a similar relationship at the upper level of a given layer is also equivalent to the compound rule 1. More specifically, the rule that prohibits T8×8 and T4×4 among the transformations of T16×16, T8×8, and T4×4 that are in a similar relationship at the upper level of layer L1 is also equivalent to the compound rule 1.

·复合规则2:在特定形状的分区中将特定的变换A置换为特定的变换B。例如,在正方形的分区中将长方形的频率变换置换为给定正方形的频率变换(例如T4×4),这样的规则相当于该复合规则2。上述复合规则(R2)可以表现为如下的基础规则的集合。Compound Rule 2: In a partition of a specific shape, a specific transformation A is replaced with a specific transformation B. For example, in a square partition, a rule that replaces a rectangular frequency transformation with a given square frequency transformation (e.g., T4×4) corresponds to Compound Rule 2. The compound rule (R2) can be expressed as a set of the following basic rules.

R2={[置换、P、T、T4×4]:(P∈正方形的分区)∧(T∈长方形的频率变换)}R2 = {[replace, P, T, T4×4]: (P ∈ square partitions) ∧ (T ∈ rectangular frequency transforms)}

此外,在横长方形的分区中将正方形的频率变换置换为横长方形的频率变换,这样的规则也相当于该复合规则2。Furthermore, a rule that replaces square frequency transformation with horizontally rectangular frequency transformation in horizontally rectangular partitions also corresponds to the composite rule 2.
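The set definitions R1 and R2 amount to expanding one compound rule into many basic rules. The sketch below illustrates that expansion under stated assumptions: rules are plain tuples, sizes are (width, height) pairs, "T8×8 or smaller" is read as both dimensions being at most 8, and the layer L0 partition list and transform preset used in the usage lines are placeholder values, not taken from this section.

```python
def expand_r1(layer_partitions, preset, max_size=8):
    """R1: prohibit every transform of size max_size×max_size or smaller
    in every partition of the given layer."""
    return [
        ("prohibit", p, t)
        for p in layer_partitions
        for t in preset
        if t[0] <= max_size and t[1] <= max_size
    ]


def expand_r2(partitions, preset, target=(4, 4)):
    """R2: in every square partition, replace every rectangular
    (non-square) transform with the given square transform."""
    return [
        ("replace", p, t, target)
        for p in partitions
        if p[0] == p[1]
        for t in preset
        if t[0] != t[1]
    ]


# Hypothetical inputs for illustration only.
L0_PARTITIONS = [(64, 64), (64, 32), (32, 64), (32, 32)]
PRESET = [(4, 4), (8, 8), (16, 16), (16, 1), (1, 16)]

r1 = expand_r1(L0_PARTITIONS, PRESET)
r2 = expand_r2(L0_PARTITIONS, PRESET)
```

With these placeholder inputs, `r1` holds one prohibit rule per (partition, small transform) pair and `r2` holds one replace rule per (square partition, rectangular transform) pair.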

(变换候补列表导出规则的决定过程)(Determination process of conversion candidate list derivation rule)

首先,在编码处理开始前预先规定以基础规则以及复合规则为构成要素的规则候补,将变换候补列表导出规则设定为空。然后,对被输入的各帧,计算分别应用在规则候补中包含的各基础规则或者复合规则而进行编码处理时的率失真成本。此外,还计算不应用全部规则候补时的率失真成本C1。接着,比较应用了各基础规则或者复合规则时的率失真成本C2和成本C1,若成本C2比成本C1小,则决定应用该基础规则或者复合规则,将其包含在变换候补列表导出规则中。First, before the start of encoding, rule candidates consisting of basic rules and composite rules are predefined, and the candidate transformation list derivation rules are set to empty. Next, for each input frame, the rate-distortion cost is calculated when encoding is performed using each basic rule or composite rule included in the candidate rules. Furthermore, the rate-distortion cost C1 is calculated when none of the candidate rules are applied. Next, the rate-distortion cost C2 when each basic rule or composite rule is applied is compared with cost C1. If cost C2 is less than cost C1, the basic rule or composite rule is applied and included in the candidate transformation list derivation rules.

通过上述过程,仅将给定规则候补中的、通过在帧的编码时进行应用而能够降低率失真成本的基础规则或者复合规则追加到变换候补列表导出规则。Through the above-described process, only basic rules or composite rules that can reduce the rate-distortion cost by being applied during frame encoding among given rule candidates are added to the transformation candidate list derivation rules.
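The decision process described above can be sketched as a simple per-frame loop. This is an illustration only: `rd_cost` is a hypothetical callable standing in for a full trial encode of the frame under a given set of rules, and the function and parameter names are assumptions.

```python
def decide_rules(frame, rule_candidates, rd_cost):
    """Return the rule candidates that individually lower the rate-distortion cost.

    rd_cost(frame, rules) is assumed to return the rate-distortion cost of
    encoding `frame` with the given list of rules applied.
    """
    derivation_rule = []           # starts empty, as in the text
    c1 = rd_cost(frame, [])        # cost C1 with no rule candidate applied
    for rule in rule_candidates:
        c2 = rd_cost(frame, [rule])  # cost C2 with only this rule applied
        if c2 < c1:
            derivation_rule.append(rule)
    return derivation_rule
```

Note that, as described, each candidate is evaluated on its own against C1, so interactions between rules selected together are not measured by this procedure.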

变换候补导出部302根据被输入的预测参数和变换候补列表导出规则,将关于在扩展MB内的各分区中能够选择的频率变换的信息作为变换候补列表而输出。变换候补列表分别与扩展MB内的各分区建立对应,在变换预置集之中包含的频率变换中,规定在各分区能够选择的频率变换的集合。另外,此时,被输入的变换候补列表导出规则也用于变换候补列表导出处理。Based on the input prediction parameters and the transform candidate list derivation rule, the transform candidate derivation unit 302 outputs information regarding the frequency transforms selectable in each partition within the extended MB as a transform candidate list. The transform candidate list is associated with each partition within the extended MB and specifies the set of frequency transforms selectable in each partition from among the frequency transforms included in the transform preset set. The input transform candidate list derivation rule is also used in the transform candidate list derivation process.

根据被输入的变换候补列表导出规则,生成针对特定的分区p的变换候补列表Cp的过程如下所述。另外,设分区p的大小为M×N像素(横M像素、纵N像素)。The process of generating the conversion candidate list Cp for a specific partition p based on the input conversion candidate list derivation rule is as follows: It is assumed that the size of the partition p is M×N pixels (M pixels horizontally and N pixels vertically).

(过程S200)在变换候补列表导出规则中包含复合规则时,将各复合规则分解为基础规则之后追加到变换候补列表导出规则中。(Process S200) When the conversion candidate list derivation rules include compound rules, each compound rule is decomposed into basic rules and then added to the conversion candidate list derivation rules.

(过程S201)对于变换候补列表导出规则中包含的全部属于基础规则1的基础规则,执行过程S202的处理。(Process S201) The processing of process S202 is executed for every basic rule in the transform candidate list derivation rule that belongs to basic rule 1.

(过程S202)将处理对象的基础规则1表示为[许可、P1、T1]。在分区p的形状与P1一致时,向变换候补列表追加频率变换T1。(Process S202) Let the basic rule 1 being processed be [permit, P1, T1]. If the shape of partition p matches P1, frequency transform T1 is added to the transform candidate list.

(过程S203)对于变换候补列表导出规则中包含的全部属于基础规则2的基础规则,执行过程S204的处理。(Process S203) The processing of process S204 is executed for every basic rule in the transform candidate list derivation rule that belongs to basic rule 2.

(过程S204)将处理对象的基础规则2表示为[禁止、P2、T2]。在分区p的形状与P2一致、并且在变换候补列表中存在频率变换T2时,从变换候补列表去除频率变换T2。(Process S204) Let the basic rule 2 being processed be [prohibit, P2, T2]. If the shape of partition p matches P2 and frequency transform T2 is present in the transform candidate list, T2 is removed from the transform candidate list.

(过程S205)对于变换候补列表导出规则中包含的全部属于基础规则3的基础规则,执行过程S206的处理。(Process S205) The processing of process S206 is executed for every basic rule in the transform candidate list derivation rule that belongs to basic rule 3.

(过程S206)将处理对象的基础规则3表示为[置换、P3、T3、T4]。在分区p的形状与P3一致、并且在变换候补列表中存在频率变换T3时,将频率变换T3置换为频率变换T4。(Process S206) Let the basic rule 3 being processed be [replace, P3, T3, T4]. If the shape of partition p matches P3 and frequency transform T3 is present in the transform candidate list, T3 is replaced with frequency transform T4.

通过以上的过程,在变换候补导出部302中,根据被输入的变换候补列表导出规则可以导出变换候补列表。Through the above process, the conversion candidate derivation unit 302 can derive a conversion candidate list based on the input conversion candidate list derivation rule.
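Processes S201 to S206 can be sketched as three passes over the basic rules: permits first, then prohibitions, then replacements. The following is a minimal illustration, assuming compound rules have already been decomposed (process S200), rules are plain tuples, and sizes are (width, height) pairs. Skipping duplicates when adding or replacing is an added assumption, as is the optional `initial` list.

```python
def derive_candidates(partition, rules, initial=()):
    """Derive the transform candidate list Cp for one partition.

    rules: iterable of ("permit", P, T), ("prohibit", P, T) or
    ("replace", P, T_old, T_new) tuples, already decomposed from compound rules.
    """
    cands = list(initial)

    # S201-S202: add permitted transforms for matching partitions.
    for r in rules:
        if r[0] == "permit" and r[1] == partition and r[2] not in cands:
            cands.append(r[2])

    # S203-S204: remove prohibited transforms for matching partitions.
    for r in rules:
        if r[0] == "prohibit" and r[1] == partition and r[2] in cands:
            cands.remove(r[2])

    # S205-S206: replace transforms for matching partitions.
    for r in rules:
        if r[0] == "replace" and r[1] == partition and r[2] in cands:
            i = cands.index(r[2])
            if r[3] in cands:
                del cands[i]      # replacement already listed; just drop the old one
            else:
                cands[i] = r[3]
    return cands


RULES = [
    ("permit", (64, 32), (4, 4)),
    ("permit", (64, 32), (8, 8)),
    ("prohibit", (64, 32), (8, 8)),
    ("replace", (64, 32), (4, 4), (16, 1)),
    ("permit", (64, 64), (16, 16)),
]
```

Because the encoder and the decoder both run this derivation from the same rules and prediction parameters, they arrive at the same candidate list without it being transmitted per partition.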

可变长编码部303生成并输出分别与被输入的变换系数、预测参数、变换候补列表、变换选择标志和变换候补列表导出规则对应的编码数据。The variable-length coding unit 303 generates and outputs coded data corresponding to the input transform coefficients, prediction parameters, transform candidate lists, transform selection flags, and transform candidate list derivation rules.

说明与变换候补列表导出规则对应的编码数据的生成处理的详细情况。编码数据通过对变换候补列表导出规则中包含的各基础规则或者复合规则进行可变长编码而生成。在基础规则的可变长编码中,首先对表示对象的基础规则被分类为基础规则1~3的哪一种的信息进行编码,接着对表示作为基础规则的适合对象的分区的信息进行编码。最后,在基础规则1的情况下,对表示要许可的频率变换的种类的信息进行编码;在基础规则2的情况下,对表示要禁止的频率变换的种类的信息进行编码;在基础规则3的情况下,对表示置换前后的各频率变换的种类的信息进行编码。另外,在预先决定了可以在变换候补列表导出规则中包含哪种基础规则的情况下,代替用上述方法对基础规则进行可变长编码,通过将表示是否适用基础规则的信息作为编码数据,能够削减码量。另外,在预先决定了总是应用特定的基础规则的情况下,不需要对该基础规则进行可变长编码。The generation of coded data corresponding to the transform candidate list derivation rule is described in detail below. The coded data is generated by variable-length coding each basic rule or composite rule included in the transform candidate list derivation rule. In variable-length coding of a basic rule, information indicating which of basic rules 1 to 3 the target rule is classified as is encoded first, followed by information indicating the partition to which the rule applies. Finally, for basic rule 1, information indicating the type of frequency transform to be permitted is encoded; for basic rule 2, information indicating the type of frequency transform to be prohibited is encoded; and for basic rule 3, information indicating the types of the frequency transforms before and after replacement is encoded. If the basic rules that may be included in the transform candidate list derivation rule are determined in advance, the code amount can be reduced by encoding information indicating whether each basic rule is applied, instead of variable-length coding the rules themselves as described above. If a specific basic rule is predetermined to always apply, that basic rule does not need to be variable-length coded.

复合规则被分解为基础规则来进行编码。此外,在预先决定了可以在变换候补列表导出规则中包含哪种复合规则的情况下,通过将表示是否适用复合规则的信息作为编码数据,能够削减码量。例如,在比32×32大的分区中能够将禁止T4×4以及T8×8这样的复合规则的适用有无作为1比特的标志来进行编码。A composite rule is encoded after being decomposed into basic rules. If the composite rules that may be included in the transform candidate list derivation rule are determined in advance, the code amount can be reduced by encoding information indicating whether each composite rule is applied. For example, whether the composite rule that prohibits T4×4 and T8×8 in partitions larger than 32×32 is applied can be encoded as a 1-bit flag.

此外,可以将特定的基础规则或者复合规则汇总到一起来规定规则组,并且对针对该规则组中包含的各个基础规则表示适用有无的信息进行编码,或者对表示是否用既定的方法估计该规则组中包含的全部基础规则的适用有无的标志进行编码。具体而言,在将在阶层L3中表示T16×16、T8×8、T4×4各自的适用有无的复合规则设为enable_t16×16_L3、enable_t8×8_L3、enable_t4×4_L3时,汇总所述3个复合规则来创建规则组enable_L3。在编码时,首先用1比特编码enable_L3的适用有无,在适用enable_L3时,分别用1比特对该规则组中包含的各复合规则的适用有无进行编码。在不适用enable_L3时,用既定的方法估计各复合规则的适用有无。Furthermore, specific basic rules or composite rules can be collected into a rule group. Either information indicating whether each basic rule in the group is applied is encoded, or a flag is encoded indicating whether the application of all basic rules in the group is to be estimated by a predetermined method. Specifically, let enable_t16×16_L3, enable_t8×8_L3, and enable_t4×4_L3 be the composite rules indicating whether T16×16, T8×8, and T4×4, respectively, are applied in layer L3; these three composite rules are collected into the rule group enable_L3. In encoding, whether enable_L3 is applied is first encoded with 1 bit. If enable_L3 is applied, whether each composite rule in the group is applied is encoded with 1 bit each. If enable_L3 is not applied, whether each composite rule is applied is estimated by a predetermined method.
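The rule group signalling described above (one bit for the group, then one bit per member rule only when the group is applied) can be sketched as follows. This is an illustration under assumptions: bits are modelled as a plain list of 0/1 integers rather than an entropy-coded stream, and `estimate` is a hypothetical callable standing in for the unspecified "predetermined method".

```python
def encode_rule_group(group_applied, member_flags):
    """Encode a rule group as a list of bits.

    One bit signals whether the group is applied; the per-rule bits follow
    only when it is.
    """
    bits = [1 if group_applied else 0]
    if group_applied:
        bits.extend(1 if flag else 0 for flag in member_flags)
    return bits


def decode_rule_group(bits, n_members, estimate):
    """Decode a rule group; returns one boolean per member rule.

    estimate() is assumed to return the member flags inferred by the
    predetermined method when the group bit is 0.
    """
    it = iter(bits)
    if next(it) == 1:
        return [next(it) == 1 for _ in range(n_members)]
    return list(estimate())


# A three-member group such as the layer L3 group in the text.
sent = encode_rule_group(True, [True, False, True])
received = decode_rule_group(sent, 3, estimate=lambda: [True, True, True])
```

When the group bit is 0 the cost is a single bit per frame regardless of how many rules the group contains, which is where the saving comes from.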

另外,可以将基础规则以及复合规则一起进行编码,而不是分别进行可变长编码。例如,可以对表示不应用全部基础规则、或者至少应用1个基础规则的标志进行编码,并且仅在该标志表示至少应用1个基础规则时,对按每个基础规则表示是否要应用的信息进行编码。此外,可以对表示是否继承在前帧应用的变换候补列表导出规则的标志进行编码,仅在不继承时才对变换候补列表导出规则进行编码。Furthermore, the basic rules and composite rules can be encoded together rather than as separate variable-length codes. For example, a flag indicating that all basic rules are not applied or that at least one basic rule is applied can be encoded, and only when the flag indicates that at least one basic rule is applied, information indicating whether each basic rule is applied is encoded. Furthermore, a flag indicating whether to inherit the candidate transformation list derivation rule applied in the previous frame can be encoded, and only when this flag indicates that the candidate transformation list derivation rule is not inherited is the candidate transformation list derivation rule encoded.

接着,对运动图像编码装置30的动作进行说明。Next, the operation of the moving picture encoding device 30 will be described.

(过程S210)从外部输入到运动图像编码装置30的输入运动图像,以帧为单位输入到变换候补列表导出规则决定部301,并且以扩展MB为单位依次输入到预测参数决定部102以及预测残差生成部106。对各帧执行过程S211~S212的处理,对各扩展MB执行过程S213~S221的处理。(Process S210) An input moving picture input from the outside to the moving picture coding device 30 is input to the transform candidate list derivation rule determination unit 301 on a frame-by-frame basis, and is then sequentially input to the prediction parameter determination unit 102 and the prediction residual generation unit 106 on an extended MB basis. Processes S211 and S212 are executed for each frame, and processes S213 to S221 are executed for each extended MB.

(过程S211)在变换候补列表导出规则决定部301中,根据输入帧,生成变换候补列表导出规则,并输出到变换候补导出部302以及可变长编码部303。(Process S211) The transform candidate list derivation rule determination unit 301 generates a transform candidate list derivation rule based on the input frame and outputs the rule to the transform candidate derivation unit 302 and the variable length coding unit 303.

(过程S212)在可变长编码部303中,根据被输入的变换候补列表导出规则生成对应的编码数据并输出到外部。(Process S212) In the variable-length coding unit 303, corresponding coded data is generated based on the input transformation candidate list derivation rule and output to the outside.

(过程S213)在预测参数决定部102中,针对处理对象扩展MB,根据被输入的输入运动图像决定预测参数,并输出到预测图像生成部103、变换候补导出部302、以及可变长编码部303。(Process S213) The prediction parameter determination unit 102 determines prediction parameters for the processing target extended MB based on the input moving picture, and outputs the prediction parameters to the prediction image generation unit 103, the transform candidate derivation unit 302, and the variable length coding unit 303.

(过程S214)在预测图像生成部103中,根据被输入的预测参数以及记录在帧存储器101中的局部解码图像,生成与输入运动图像中的处理对象扩展MB的区域近似的预测图像并输出到预测残差生成部106以及局部解码图像生成部110。(Process S214) In the prediction image generation unit 103, based on the input prediction parameters and the local decoded image recorded in the frame memory 101, a prediction image that is approximate to the area of the processing object extended MB in the input motion image is generated and output to the prediction residual generation unit 106 and the local decoded image generation unit 110.

(过程S215)在预测残差生成部106中,根据被输入的输入运动图像和预测图像,生成与处理对象扩展MB对应的预测残差,并输出到频率变换决定部112以及变换系数生成部107。(Process S215 ) The prediction residual generation unit 106 generates a prediction residual corresponding to the extended MB to be processed based on the input moving picture and the predicted picture, and outputs it to the frequency transformation determination unit 112 and the transform coefficient generation unit 107 .

(过程S216)在变换候补导出部302中,根据被输入的预测参数以及变换候补列表导出规则,导出处理对象扩展MB的各分区中的变换候补列表,并输出到频率变换决定部112以及可变长编码部303。(Process S216) The transform candidate derivation unit 302 derives a transform candidate list for each partition of the processing target extended MB based on the input prediction parameters and transform candidate list derivation rule, and outputs the list to the frequency transform determination unit 112 and the variable length coding unit 303.

(过程S217)在频率变换决定部112中,根据被输入的变换候补列表和预测残差,决定应用于处理对象扩展MB的各分区的频率变换,并作为变换选择标志输出到变换系数生成部107以及可变长编码部303以及预测残差重建部109。(Process S217) The frequency transform determination unit 112 determines the frequency transform to be applied to each partition of the processing target extended MB based on the input transform candidate list and prediction residual, and outputs it as a transform selection flag to the transform coefficient generation unit 107, the variable length coding unit 303, and the prediction residual reconstruction unit 109.

(过程S218)在变换系数生成部107中,将由被输入的变换选择标志规定的频率变换应用于被输入的预测残差,生成与处理对象扩展MB对应的变换系数,并输出到可变长编码部303以及预测残差重建部109。(Process S218) The transform coefficient generation unit 107 applies the frequency transform specified by the input transform selection flag to the input prediction residual, generates the transform coefficients corresponding to the processing target extended MB, and outputs them to the variable length coding unit 303 and the prediction residual reconstruction unit 109.

(过程S219)在预测残差重建部109中,将与由被输入的变换选择标志规定的频率变换对应的逆频率变换应用于被输入的变换系数,从而重建与处理对象扩展MB对应的预测残差,并输出到局部解码图像生成部110。(Process S219) In the prediction residual reconstruction unit 109, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag is applied to the input transform coefficient, thereby reconstructing the prediction residual corresponding to the processing object extended MB and outputting it to the local decoded image generation unit 110.

(过程S220)在局部解码图像生成部110中,根据被输入的预测残差和预测图像生成局部解码图像并输出到帧存储器101进行记录。(Process S220) In the local decoded image generation unit 110, a local decoded image is generated based on the input prediction residual and the predicted image, and is output to the frame memory 101 for storage.

(过程S221)在可变长编码部303中,利用被输入的变换候补列表,对被输入的变换系数以及预测参数以及变换选择标志进行可变长编码,作为编码数据输出到外部。(Process S221) The variable length coding unit 303 variable-length codes the input transform coefficients, prediction parameters, and transform selection flag using the input transform candidate list, and outputs the result to the outside as coded data.

通过上述过程,在运动图像编码装置30中,能够对被输入的输入运动图像进行编码从而生成编码数据并输出到外部。Through the above-described process, the moving picture encoding device 30 can encode the input moving picture to generate encoded data and output it to the outside.

<运动图像解码装置40的构成><Configuration of Moving Image Decoding Device 40>

下面,说明对由运动图像编码装置30进行了编码的编码数据进行解码从而生成解码运动图像的运动图像解码装置40。Next, a description will be given of a moving picture decoding device 40 that decodes the coded data coded by the moving picture coding device 30 to generate a decoded moving picture.

图16是表示运动图像解码装置40的构成的框图。运动图像解码装置40包括帧存储器101、预测图像生成部103、预测残差重建部109、局部解码图像生成部110、变换候补导出部302以及可变长符号解码部401。FIG. 16 is a block diagram showing the configuration of the moving picture decoding device 40. The moving picture decoding device 40 includes a frame memory 101, a prediction image generation unit 103, a prediction residual reconstruction unit 109, a local decoded image generation unit 110, a transform candidate derivation unit 302, and a variable length code decoding unit 401.

可变长符号解码部401根据被输入的编码数据和变换候补列表,对预测参数、变换选择标志、变换系数、以及变换候补列表导出规则进行解码并输出。具体而言,首先,对变换候补列表导出规则进行解码并输出。接着,从编码数据解码预测参数并输出。接着,利用变换候补列表,从编码数据解码变换选择标志并输出。最后,利用变换选择标志,从编码数据解码变换系数并输出。The variable-length code decoding unit 401 decodes and outputs prediction parameters, a transform selection flag, transform coefficients, and a transform candidate list derivation rule based on the input coded data and transform candidate list. Specifically, the transform candidate list derivation rule is first decoded and output. Next, the prediction parameters are decoded from the coded data and output. Next, the transform selection flag is decoded from the coded data using the transform candidate list and output. Finally, the transform coefficients are decoded from the coded data using the transform selection flag and output.
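The reason the transform candidate list must be available before the transform selection flag is parsed can be seen in a toy binarization. The sketch below is not the codeword design of this document: it assumes, purely for illustration, that the flag is a fixed-length index into the candidate list, so the number of bits to read depends on the list length. All function names are hypothetical.

```python
def index_width(n_candidates):
    """Bits needed for a fixed-length index into a list of n candidates."""
    return 0 if n_candidates <= 1 else (n_candidates - 1).bit_length()


def encode_selection(candidates, chosen):
    """Return the bits (most significant first) selecting `chosen`."""
    width = index_width(len(candidates))
    idx = candidates.index(chosen)
    return [(idx >> (width - 1 - i)) & 1 for i in range(width)]


def decode_selection(candidates, bits):
    """Return (selected transform, number of bits consumed)."""
    width = index_width(len(candidates))
    idx = 0
    for b in bits[:width]:
        idx = (idx << 1) | b
    return candidates[idx], width


cands = [(4, 4), (8, 8), (16, 16)]
flag_bits = encode_selection(cands, (16, 16))
selected, used = decode_selection(cands, flag_bits)
```

Under this assumption a partition whose candidate list holds a single transform needs no flag bits at all, which illustrates how restricting the candidate list can reduce side information.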

<运动图像解码装置40的动作><Operation of Video Decoding Device 40>

接着,对运动图像解码装置40的动作进行说明。Next, the operation of the moving picture decoding device 40 will be described.

(过程S230)从外部输入到运动图像解码装置40的编码数据,以帧为单位依次输入到可变长符号解码部401,对与各帧对应的编码数据,依次执行以后的S231~S239的处理。(Process S230) The coded data input from the outside to the motion picture decoding device 40 is sequentially input to the variable length code decoding unit 401 in units of frames, and the subsequent processes S231 to S239 are sequentially performed on the coded data corresponding to each frame.

(过程S231)在可变长符号解码部401,从被输入的编码数据,解码与处理对象帧对应的变换候补列表导出规则,并输出到变换候补导出部302。(Process S231) The variable-length code decoding unit 401 decodes the conversion candidate list derivation rule corresponding to the processing target frame from the input coded data, and outputs it to the conversion candidate derivation unit 302.

(过程S232)在可变长符号解码部401中,将被输入的帧单位的编码数据分割为扩展MB单位的编码数据,并对与各扩展MB对应的编码数据依次执行以后的S233~S239的处理。(Process S232) In the variable-length code decoding unit 401, the input frame-based coded data is divided into extended MB-based coded data, and the subsequent processes of S233 to S239 are sequentially performed on the coded data corresponding to each extended MB.

(过程S233)在可变长符号解码部401中,从作为处理对象的扩展MB单位的编码数据解码预测参数,并输出到变换候补导出部302。(Process S233) The variable-length code decoding unit 401 decodes prediction parameters from the coded data in the extended MB unit to be processed, and outputs the decoded prediction parameters to the transform candidate derivation unit 302.

(过程S234)在变换候补导出部302中,根据被输入的变换候补列表导出规则以及预测参数,导出处理对象扩展MB的各分区中的变换候补列表,并输出到可变长符号解码部401。(Process S234) The transform candidate derivation unit 302 derives a transform candidate list for each partition of the processing target extended MB based on the input transform candidate list derivation rule and prediction parameters, and outputs the list to the variable-length code decoding unit 401.

(过程S235)在可变长符号解码部401中,根据被输入的编码数据和变换候补列表,解码与处理对象扩展MB对应的变换选择标志,并输出到预测残差重建部109。(Process S235) The variable length code decoding unit 401 decodes the transform selection flag corresponding to the processing target extended MB based on the input coded data and the transform candidate list, and outputs it to the prediction residual reconstruction unit 109.

(过程S236)在可变长符号解码部401中,根据被输入的编码数据和在(过程S235)导出的变换选择标志,解码与处理对象扩展MB对应的变换系数,并输出到预测残差重建部109。(Process S236) The variable length code decoding unit 401 decodes the transform coefficients corresponding to the processing target extended MB based on the input coded data and the transform selection flag derived in process S235, and outputs them to the prediction residual reconstruction unit 109.

(过程S237)在预测图像生成部103中,根据被输入的预测参数以及记录在帧存储器101中的局部解码图像,生成与处理对象扩展MB对应的预测图像,并输出到局部解码图像生成部110。(Process S237 ) The predicted image generator 103 generates a predicted image corresponding to the processing target extended MB based on the input prediction parameters and the local decoded image stored in the frame memory 101 , and outputs the predicted image to the local decoded image generator 110 .

(过程S238)在预测残差重建部109中,将与由被输入的变换选择标志规定的频率变换对应的逆频率变换应用于被输入的变换系数,从而重建与处理对象扩展MB对应的预测残差,并输出到局部解码图像生成部110。(Process S238) In the prediction residual reconstruction unit 109, the inverse frequency transform corresponding to the frequency transform specified by the input transform selection flag is applied to the input transform coefficient, thereby reconstructing the prediction residual corresponding to the processing object extended MB and outputting it to the local decoded image generation unit 110.

(过程S239)在局部解码图像生成部110中,根据被输入的预测残差和预测图像来生成局部解码图像并输出到帧存储器101进行记录,并且作为与处理对象块对应的解码运动图像上的区域而输出到外部。(Process S239) The local decoded image generator 110 generates a local decoded image based on the input prediction residual and the prediction image, outputs it to the frame memory 101 for storage, and outputs it to the outside as a region on the decoded moving image corresponding to the processing target block.

如以上说明的那样,根据运动图像解码装置40,能够从由运动图像编码装置30生成的编码数据生成解码运动图像。As described above, the moving picture decoding device 40 can generate a decoded moving picture from the coded data generated by the moving picture encoding device 30.

此外,典型地,可以将上述实施方式中的运动图像编码装置以及运动图像解码装置的一部分、或者全部作为集成电路即LSI(Large Scale Integration,大规模集成电路)来实施。可以将运动图像编码装置以及运动图像解码装置的各功能块个别地芯片化,也可以集成一部分或者全部来芯片化。此外,集成电路化的手法不局限于LSI,也可以利用专用电路或者通用处理器来实现。此外,根据半导体技术的进步,在出现了代替LSI的集成电路化的技术时,可以采用基于该技术的集成电路。Furthermore, typically, part or all of the moving picture encoding device and moving picture decoding device in the above-described embodiments can be implemented as an integrated circuit, i.e., an LSI (Large Scale Integration). Each functional block of the moving picture encoding device and moving picture decoding device can be individually integrated into a chip, or some or all of the functional blocks can be integrated into a chip. Furthermore, integrated circuit implementation is not limited to LSI; dedicated circuits or general-purpose processors can also be used. Furthermore, as semiconductor technology advances, if an integrated circuit technology that replaces LSI becomes available, integrated circuits based on that technology can be employed.

符号说明Explanation of symbols

10…运动图像编码装置、11…运动图像编码装置、20…运动图像解码装置、21…运动图像解码装置、30…运动图像编码装置、40…运动图像解码装置、101…帧存储器、102…预测参数决定部、103…预测图像生成部、104…变换制约导出部、105…频率变换决定部、106…预测残差生成部、107…变换系数生成部、108…可变长编码部、109…预测残差重建部、110…局部解码图像生成部、111…变换候补导出部、112…频率变换决定部、113…可变长编码部、201…可变长符号解码部、202…可变长符号解码部、301…变换候补列表导出规则决定部、302…变换候补导出部、303…可变长编码部、401…可变长符号解码部。10: moving picture encoding device, 11: moving picture encoding device, 20: moving picture decoding device, 21: moving picture decoding device, 30: moving picture encoding device, 40: moving picture decoding device, 101: frame memory, 102: prediction parameter determination unit, 103: prediction image generation unit, 104: transform constraint derivation unit, 105: frequency transform determination unit, 106: prediction residual generation unit, 107: transform coefficient generation unit, 108: variable length coding unit, 109: prediction residual reconstruction unit, 110: local decoded image generation unit, 111: transform candidate derivation unit, 112: frequency transform determination unit, 113: variable length coding unit, 201: variable length code decoding unit, 202: variable length code decoding unit, 301: transform candidate list derivation rule determination unit, 302: transform candidate derivation unit, 303: variable length coding unit, 401: variable length code decoding unit.

Claims (2)

1. A moving picture decoding device that decodes input encoded data block by block, characterized by comprising:
a variable-length decoding unit that decodes, from the input encoded data, the partition structure of a block to be processed;
a predicted image generation unit that generates a predicted image in units of the partitions defined by the partition structure; and
a transform candidate derivation unit that determines, based on partition shape information, a transform candidate list, which is a list of applicable transforms, wherein the partition shape information includes at least one of: the partition size of each partition, a feature of the partition size, and the hierarchy level in the partition structure,
wherein the variable-length decoding unit decodes a transform selection flag based on the input encoded data and the transform candidate list, and decodes the transform coefficients of the block to be processed based on the transform selection flag, the transform selection flag specifying one transform from among the transforms included in the transform candidate list,
the moving picture decoding device further comprising:
a prediction residual reconstruction unit that reconstructs a prediction residual by applying, to the transform coefficients, the inverse transform corresponding to the transform specified by the transform selection flag; and
a local decoded image generation unit that outputs, based on the predicted image and the prediction residual, decoded image data corresponding to the block to be processed,
wherein the partition structure is expressed as a hierarchical structure, and each partition is defined as belonging to one of the hierarchy levels according to its shape.

2. A moving picture encoding device that divides an input moving picture into blocks of a given size and performs encoding block by block, characterized by comprising:
a prediction parameter determination unit that determines the partition structure of a block;
a predicted image generation unit that generates a predicted image in units of the partitions defined by the partition structure;
a transform coefficient generation unit that applies, to a prediction residual that is the difference between the predicted image and the input moving picture, any one of the transforms included in a given transform preset;
a transform candidate derivation unit that derives, based on partition shape information, a transform candidate list, which is a list of applicable transforms, wherein the partition shape information includes at least one of: the partition size of each partition, a feature of the partition size, and the hierarchy level in the partition structure;
a frequency transform determination unit that determines, for each of the blocks, a transform selection flag indicating which of the transforms included in the transform candidate list is to be applied to the prediction residual in the block; and
a variable-length encoding unit that performs variable-length encoding on the transform selection flag according to the transform candidate list,
wherein the partition structure is expressed as a hierarchical structure, and each partition is defined as belonging to one of the hierarchy levels according to its shape,
wherein the transform preset includes frequency transforms, the frequency transforms being 4×4 DCT, 8×8 DCT, 16×16 DCT, and 32×32 DCT.
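The interplay between the transform candidate list and the transform selection flag recited in both claims can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the names `Partition`, `derive_transform_candidates`, `flag_bits`, and `decode_transform_selection` are hypothetical, the rule "keep every DCT whose side fits inside the partition" is a stand-in for whatever derivation rule an actual embodiment uses, and a fixed-length binary index replaces the variable-length code of the claims. Only the preset of 4×4, 8×8, 16×16, and 32×32 DCTs is taken from claim 2.

```python
from dataclasses import dataclass
from math import ceil, log2
from typing import Iterator, List, Sequence

# Transform preset named in claim 2: square DCTs identified by side length.
TRANSFORM_PRESET = (4, 8, 16, 32)


@dataclass(frozen=True)
class Partition:
    """Hypothetical partition descriptor: width and height in pixels."""
    width: int
    height: int


def derive_transform_candidates(
    partition: Partition, preset: Sequence[int] = TRANSFORM_PRESET
) -> List[int]:
    """Assumed rule: keep transforms whose side fits inside the partition."""
    limit = min(partition.width, partition.height)
    candidates = [n for n in preset if n <= limit]
    # Fall back to the smallest transform so the list is never empty.
    return candidates or [min(preset)]


def flag_bits(candidates: Sequence[int]) -> int:
    """Bits needed for a fixed-length index; zero when there is no choice."""
    return ceil(log2(len(candidates))) if len(candidates) > 1 else 0


def decode_transform_selection(bits: Iterator[int], candidates: Sequence[int]) -> int:
    """Read the index from the bit iterator and return the selected DCT size."""
    index = 0
    for _ in range(flag_bits(candidates)):
        index = (index << 1) | next(bits)
    if index >= len(candidates):
        raise ValueError("transform selection flag is outside the candidate list")
    return candidates[index]


if __name__ == "__main__":
    part = Partition(width=16, height=8)
    cands = derive_transform_candidates(part)
    chosen = decode_transform_selection(iter([1]), cands)
    print(cands, flag_bits(cands), chosen)
```

Under these assumptions, a partition that admits a single transform consumes no bits for the flag, which shows why encoder and decoder must derive the same candidate list from the partition shape before the flag is encoded or decoded.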
HK15111995.5A 2009-04-08 2015-12-04 Video encoding apparatus and video decoding apparatus HK1211399B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2009-093606 2009-04-08
JP2009093606 2009-04-08
JP2009146509 2009-06-19
JP2009-146509 2009-06-19

Publications (2)

Publication Number Publication Date
HK1211399A1 (en) 2016-05-20
HK1211399B (en) 2019-09-27

Similar Documents

Publication Publication Date Title
CN102388614B (en) Video encoding device and video decoding device
US10645410B2 (en) Video decoding apparatus
US8194989B2 (en) Method and apparatus for encoding and decoding image using modification of residual block
US9854235B2 (en) Methods and devices for entropy coding in scalable video compression
JP2012518940A (en) Divided block encoding method in video encoding, divided block decoding method in video decoding, and recording medium for realizing the same
CN118381946A (en) Improvements to Boundary Enforced Partitioning
KR101407755B1 (en) Multi-level significance maps for encoding and decoding
TWI533705B (en) Methods and devices for context set selection
HK1211399B (en) Video encoding apparatus and video decoding apparatus
HK1211398B (en) Video encoding apparatus and video decoding apparatus
HK1211400B (en) Video encoding apparatus and video decoding apparatus
HK1211401B (en) Video encoding apparatus and video decoding apparatus
WO2025148495A1 (en) Method and apparatus for coefficient coding in video coding
JP4008846B2 (en) Image encoding apparatus, image encoding method, image encoding program, and recording medium recording the program
WO2024163447A1 (en) System and method for filtered intra block copy prediction