JP2015211269A - Moving image encoding device, moving image encoding method and computer program for moving image encoding

Info

Publication number: JP2015211269A
Application number: JP2014090381A
Authority: JP
Inventors: 幸二山田; Koji Yamada
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2014-04-24
Filing date: 2014-04-24
Publication date: 2015-11-24
Anticipated expiration: 2034-04-24
Also published as: JP6248783B2

Abstract

PROBLEM TO BE SOLVED: To provide a moving image encoding device capable of selecting an orthogonal transform size that suppresses image quality deterioration in a decoded picture.

SOLUTION: A moving image encoding device 1 includes: a complexity calculation section 11 that calculates a complexity representing how complex the scene captured in an encoding target block within a picture is; a size selection section 13 that, for each of a plurality of mutually different orthogonal transform sizes, calculates an evaluation value representing the code amount obtained when the encoding target block is orthogonally transformed in units of sub-blocks having that orthogonal transform size, weighting it so that the lower the complexity, the smaller the evaluation value becomes for smaller orthogonal transform sizes, and selects the orthogonal transform size that minimizes the evaluation value; and an encoding section 14 that encodes orthogonal transform coefficients obtained by orthogonally transforming a prediction error image between the encoding target block and a prediction block in units of sub-blocks having the selected orthogonal transform size.

Description

The present invention relates to, for example, a moving image encoding device, a moving image encoding method, and a computer program for moving image encoding that divide a picture included in moving image data into a plurality of blocks and encode each block.

Moving image data generally has a very large data volume. A device that handles moving image data therefore compresses the data by encoding it before transmitting it to another device or storing it in a storage device.

In this process, the picture to be encoded is divided into a plurality of blocks and encoded block by block. For an encoding target block, a moving image encoding device generates a prediction block either from a picture encoded before the picture that contains the block or from an already encoded block within the same picture. To reduce redundancy by exploiting the high correlation between the encoding target block and the prediction block, the device calculates a prediction error image representing the error between the two. It then obtains orthogonal transform coefficients by applying an orthogonal transform, which reduces redundancy by exploiting the high spatial correlation among the pixels within a block. Finally, the device quantizes the orthogonal transform coefficients and variable-length encodes them.

Furthermore, in moving image coding schemes such as H.264 MPEG-4 Advanced Video Coding (H.264 MPEG-4 AVC) and H.265 (also called High Efficiency Video Coding (HEVC)), the moving image encoding device can select, according to the picture, the size of the block that serves as the prediction unit or the orthogonal transform unit. The device therefore calculates, for the encoding target block, an evaluation value of the encoded information amount for each combination of a plurality of division modes, prediction modes, and transform size modes prepared in advance. It then encodes the block with the combination that minimizes the evaluation value (see, for example, Patent Document 1).

Patent Document 1: JP 2007-243427 A

The scene shown in a picture may be one in which a small object moves at high speed against a relatively flat background, such as a ball rolling across a playing field. In such a case, the picture has few distinctive features and the object moves a large distance, so when the picture is encoded in the inter prediction encoding mode, in which a picture is encoded with reference to other pictures, motion prediction may fail. That is, the motion search performed to generate the prediction block may select a region that shows only the background, so that the prediction block does not contain the moving object. In such a scene, even when the encoding target block is encoded in the intra prediction encoding mode, which uses information from an already encoded region within the same picture, the prediction block is still created from a region that shows only the background. As a result, in the prediction error image, the pixels in the region where the moving object exists have non-zero values, while the pixels in the other regions are almost zero.
In such a case, the larger the block size, the smaller the evaluation value of the code amount, so a relatively large block is selected as the orthogonal transform block size. As the orthogonal transform block becomes larger, the region containing the small moving object occupies a smaller fraction of the block, so the orthogonal transform coefficients arising from that region become high-frequency components with relatively small absolute values. Quantization then reduces those coefficients to 0, the information about the moving object is lost, and the moving object no longer appears in the decoded picture. Thus, unless an appropriate size is selected for the block that serves as the unit of orthogonal transform, the image quality of the decoded picture deteriorates.

Accordingly, an object of the present specification is to provide a moving image encoding device capable of selecting an orthogonal transform size that suppresses image quality deterioration in a decoded picture.

According to one embodiment, a moving image encoding device that encodes a picture included in moving image data is provided. The moving image encoding device includes: a dividing unit that divides the picture into a plurality of blocks; a complexity calculation unit that calculates a complexity representing how complex the scene shown in an encoding target block among the plurality of blocks is; a prediction block generation unit that generates a prediction block for the encoding target block from the encoding target picture containing the encoding target block or from a picture encoded before the encoding target picture; a size selection unit that, for each of a plurality of mutually different orthogonal transform sizes, calculates an evaluation value representing the code amount obtained when the encoding target block is orthogonally transformed in units of sub-blocks having that orthogonal transform size, weighting it so that the lower the complexity, the smaller the evaluation value becomes for smaller orthogonal transform sizes, and selects the orthogonal transform size that minimizes the evaluation value; and an encoding unit that encodes orthogonal transform coefficients obtained by orthogonally transforming a prediction error image between the encoding target block and the prediction block in units of sub-blocks having the selected orthogonal transform size.

The objects and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention as claimed.

The moving image encoding device disclosed in this specification can select an orthogonal transform size that suppresses image quality deterioration in a decoded picture.

FIG. 1 is a diagram explaining how the size of the block to which the orthogonal transform is applied affects the image quality of a decoded picture.
FIG. 2 is a schematic configuration diagram of a moving image encoding device according to one embodiment.
FIG. 3 is a diagram showing the relationship between the complexity of an encoding target block and the weighting factor.
FIG. 4 is an operation flowchart of the moving image encoding process.
FIG. 5 is a diagram showing the relationship between the complexity of a sub-block and the weighting factor according to a modification.
FIG. 6 is a diagram showing an example of picture division in HEVC.
FIG. 7 is a diagram showing an example of the process for determining the orthogonal transform size in conformance with HEVC.
FIG. 8 is a configuration diagram of a computer that operates as a moving image encoding device by running a computer program that implements the functions of the units of the moving image encoding device according to the embodiment or its modifications.

The moving image encoding device is described below with reference to the drawings. First, the effect of the size of the block to which the orthogonal transform is applied within the encoding target block is explained.

FIG. 1 is a diagram explaining how the size of the block to which the orthogonal transform is applied affects the image quality of a decoded picture. A prediction error image 100 with a size of 16×16 pixels includes a region 101 representing a moving object. The region 101 is small, for example, smaller than 4×4 pixels. Suppose that the prediction error image 100 is divided into sub-blocks of 4×4 pixels, each being a unit to which the orthogonal transform is applied, and that one of them, sub-block 110, contains the region 101. When each sub-block is orthogonally transformed, the sub-block 110 is small, so the region 101 causes its orthogonal transform coefficients to take relatively large values. In the sub-blocks other than sub-block 110, the prediction error image 100 is flat, so the orthogonal transform coefficients are almost 0 except for the DC component. Because the coefficients of sub-block 110 are relatively large, at least some of the quantized orthogonal transform coefficients other than the DC component remain non-zero, as shown in sub-block 120, which is sub-block 110 after quantization. As a result, the region 101 is reproduced in the decoded picture 130.

On the other hand, when the entire prediction error image 100 is orthogonally transformed as a single block 140, the region 101 is small relative to the block 140, so the absolute values of the orthogonal transform coefficients are also small. As a result, as shown in block 150, which is block 140 after quantization, the quantized orthogonal transform coefficients of all components other than the DC component become 0, and the information about the region 101 is lost. Consequently, the region 101 disappears from the decoded picture 160.
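The effect described for FIG. 1 can be illustrated numerically. The following is a minimal sketch, not the device's implementation: it assumes an orthonormal 2-D DCT-II as the orthogonal transform, a simple uniform quantizer that rounds each coefficient divided by a step size, and a residual consisting of a single non-zero pixel. The function names, the pixel value 32, and the step size 16 are all hypothetical choices made for illustration.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][x] is the k-th 1-D DCT basis function sampled at position x.
    basis = [[scale(k) * math.cos((2 * x + 1) * k * math.pi / (2 * n))
              for x in range(n)] for k in range(n)]
    # Separable transform: first along each row, then along each column.
    tmp = [[sum(basis[v][x] * block[y][x] for x in range(n))
            for v in range(n)] for y in range(n)]
    return [[sum(basis[u][y] * tmp[y][v] for y in range(n))
             for v in range(n)] for u in range(n)]


def count_nonzero_after_quantization(residual, transform_size, step):
    """Count coefficients that survive quantization for a given transform size."""
    n = len(residual)
    count = 0
    for top in range(0, n, transform_size):
        for left in range(0, n, transform_size):
            sub = [row[left:left + transform_size]
                   for row in residual[top:top + transform_size]]
            for row in dct_2d(sub):
                # Assumed quantizer: uniform rounding, no dead zone.
                count += sum(1 for c in row if round(c / step) != 0)
    return count


# 16x16 prediction error image that is flat except for one small "object".
residual = [[0] * 16 for _ in range(16)]
residual[4][4] = 32

small = count_nonzero_after_quantization(residual, 4, 16)
large = count_nonzero_after_quantization(residual, 16, 16)
print("non-zero quantized coefficients, 4x4 transform:", small)
print("non-zero quantized coefficients, 16x16 transform:", large)
```

With these assumed values, every coefficient of the 16×16 transform has magnitude at most 4, which is below half the quantization step, so all of them quantize to 0, while the 4×4 transform keeps some non-zero coefficients in the sub-block that contains the pixel.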

This kind of image quality deterioration, caused by the disappearance of a small region, is hard for a viewer to notice when the scene shown in the picture is relatively complex, but it stands out when the background is relatively flat.

In the present embodiment, therefore, the moving image encoding device calculates, for the encoding target block, a complexity representing how complex the scene shown in that block is. When determining the size of the block that serves as the unit of orthogonal transform, the device calculates the evaluation value of the code amount of the encoding target block by weighting the term whose code amount increases as the orthogonal transform size decreases with a weighting factor that becomes smaller as the complexity becomes lower. This makes it easier for the device to select a relatively small orthogonal transform size for a block that shows a flat scene.

Note that a picture may be either a frame or a field. A frame is one still image in the moving image data, whereas a field is a still image obtained by extracting only the odd-numbered rows or only the even-numbered rows of data from a frame.

FIG. 2 is a schematic configuration diagram of a moving image encoding device according to one embodiment. The moving image encoding device 1 includes a dividing unit 10, a complexity calculation unit 11, a prediction block generation unit 12, a size mode selection unit 13, and an encoding unit 14. The encoding unit 14 includes a prediction error calculation unit 15, an orthogonal transform unit 16, a quantization unit 17, a decoding unit 18, a storage unit 19, and a variable length encoding unit 20.
Each of these units of the moving image encoding device 1 is formed as a separate circuit. Alternatively, the units may be implemented in the moving image encoding device 1 as a single integrated circuit in which the circuits corresponding to the respective units are integrated. Furthermore, the units may be functional modules realized by a computer program executed on a processor included in the moving image encoding device 1.

A control unit (not shown) that controls the entire moving image encoding device 1 sequentially inputs the pictures included in the moving image data to the dividing unit 10, for example, in the encoding order specified by the Group Of Pictures (GOP). A GOP is a structure that contains a plurality of consecutive pictures and defines the encoding method for each picture.
The dividing unit 10 divides the picture to be encoded into a plurality of blocks, each having a predetermined number of pixels. Each block has the largest of the applicable orthogonal transform sizes, for example, 16 pixels horizontally by 16 pixels vertically. The dividing unit 10 then outputs the blocks, for example, in raster scan order.
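As an illustration of the division step, the sketch below splits a picture into 16×16 blocks and yields them in raster scan order. It is only a sketch under stated assumptions: the picture is represented as a list of pixel rows, its width and height are assumed to be multiples of the block size (the text does not describe padding), and the function name `split_into_blocks` is hypothetical.

```python
def split_into_blocks(picture, block_size=16):
    """Yield (top, left, block) tuples in raster scan order.

    Assumes the picture dimensions are multiples of block_size.
    """
    height = len(picture)
    width = len(picture[0])
    for top in range(0, height, block_size):        # block rows, top to bottom
        for left in range(0, width, block_size):    # left to right in each row
            block = [row[left:left + block_size]
                     for row in picture[top:top + block_size]]
            yield top, left, block


# Example: a 32-row by 48-column picture yields 2 x 3 = 6 blocks.
picture = [[(y * 48 + x) % 256 for x in range(48)] for y in range(32)]
positions = [(top, left) for top, left, _ in split_into_blocks(picture)]
print(positions)
```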

The complexity calculation unit 11 calculates, for the encoding target block among the blocks output from the dividing unit 10, a complexity representing how complex the scene shown in that block is. The complexity is used to determine the weighting factor applied when calculating the evaluation value of the code amount, which in turn is used to determine the orthogonal transform block size and the prediction mode. The prediction mode defines how the prediction block is generated and indicates whether the inter prediction encoding mode or the intra prediction encoding mode is used. In addition, when the intra prediction encoding mode is selected, the prediction mode indicates the intra prediction mode that defines how the prediction block is generated; when the inter prediction encoding mode is selected, it indicates whether unidirectional or bidirectional prediction is used.

In the present embodiment, the complexity calculation unit 11 calculates the sum of the absolute differences between each pixel value and the average pixel value of all pixels in the block (called the activity). The activity is calculated by the following equation.

$$ \mathrm{Act} = \sum_{i=1}^{M} \lvert p_i - m \rvert \tag{1} $$

Here, p_i is the value of a pixel in the block, and m is the average pixel value of all pixels in the block. M is the total number of pixels in the block; for example, when the block has a size of 16×16 pixels, M = 256. Act is the activity of the block.

Instead of the block activity, the complexity calculation unit 11 may calculate the activity per pixel, PixAct, as the complexity according to the following equation.

$$ \mathrm{PixAct} = \frac{\mathrm{Act}}{M} \tag{2} $$

Alternatively, the complexity calculation unit 11 may calculate the variance of the pixel values in the block as the complexity.
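The three complexity measures described above (block activity, activity per pixel, and variance) can be sketched as follows. This is an illustrative sketch only: the block is assumed to be a list of pixel rows, and the function names are hypothetical rather than taken from the source.

```python
def block_activity(block):
    """Sum of absolute differences between each pixel and the block mean (Act)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def activity_per_pixel(block):
    """Activity divided by the number of pixels in the block (PixAct)."""
    pixel_count = sum(len(row) for row in block)
    return block_activity(block) / pixel_count


def pixel_variance(block):
    """Variance of the pixel values in the block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


# A perfectly flat block has zero complexity under all three measures.
flat_block = [[128] * 16 for _ in range(16)]
print(block_activity(flat_block), activity_per_pixel(flat_block), pixel_variance(flat_block))

# A small textured block for comparison.
textured_block = [[0, 10], [10, 0]]
print(block_activity(textured_block), activity_per_pixel(textured_block), pixel_variance(textured_block))
```

Dividing by the pixel count makes the per-pixel activity independent of the block size, which is convenient when one threshold or weighting curve is shared across block sizes.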

The complexity calculation unit 11 outputs the complexity to the size mode selection unit 13.

The prediction block generation unit 12 generates a prediction block for each prediction mode applicable to the encoding target block.

For example, if the encoding target picture is an I picture, the prediction block generation unit 12 generates, for each intra prediction mode that defines a prediction block generation method, a prediction block from the already encoded blocks adjacent to the left or upper side of the encoding target block. The encoding target picture is the picture that contains the encoding target block. An I picture is a picture that is intra prediction encoded and is not inter prediction encoded.

If the encoding target picture is a P picture, the prediction block generation unit 12 performs, for example, a motion search between the encoding target block and a reference picture obtained by decoding an already encoded picture. The prediction block generation unit 12 then calculates a motion vector indicating the region on the reference picture that best matches the encoding target block, and uses the region indicated by that motion vector as the prediction block. If the encoding target picture is a B picture, the prediction block generation unit 12 performs the motion search in two directions, for example, on both a picture that precedes and a picture that follows the encoding target picture in playback order, and finds the region on each of the two pictures that best matches the encoding target block. The prediction block generation unit 12 then averages the values of the pixels at corresponding positions in the two regions to obtain the prediction block. For a P picture or a B picture, the prediction block generation unit 12 may also generate a prediction block for each intra prediction mode, as in the case of an I picture.
For a B picture, the prediction block generation unit 12 may also generate a prediction block by unidirectional motion prediction, as for a P picture. A P picture is a picture that can be inter prediction encoded using information from one already encoded picture. A B picture is a picture that can be bidirectionally inter prediction encoded using information from two already encoded pictures. Each block in a P picture may be intra prediction encoded. Each block in a B picture may likewise be intra prediction encoded, or may be inter prediction encoded in one direction.
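The motion search and the bidirectional averaging can be sketched as follows. The text only says that the region that "best matches" the block is found, so two things here are assumptions: the matching criterion (sum of absolute differences, SAD) and the search strategy (exhaustive search over a square window). The rounding in the average is also an assumption, and all function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_search(target, reference, top, left, search_range):
    """Exhaustive search around (top, left); returns ((dy, dx), prediction block)."""
    n = len(target)
    height, width = len(reference), len(reference[0])
    best_cost, best_vector, best_block = None, None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue  # candidate region falls outside the reference picture
            candidate = [row[x:x + n] for row in reference[y:y + n]]
            cost = sad(target, candidate)
            if best_cost is None or cost < best_cost:
                best_cost, best_vector, best_block = cost, (dy, dx), candidate
    return best_vector, best_block


def bidirectional_prediction(region_a, region_b):
    """Average co-located pixels of two matched regions (rounding is assumed)."""
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(region_a, region_b)]


# Example: the target block is the reference region at row 2, column 3.
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
target = [row[3:7] for row in reference[2:6]]
vector, prediction = motion_search(target, reference, top=1, left=1, search_range=2)
print(vector)
```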

Furthermore, when the size of the block to which a prediction mode is applied, or the size of the block subject to motion search, can be changed, the prediction block generation unit 12 may generate a prediction block for each of a plurality of sub-blocks obtained by dividing the encoding target block.

The prediction block generation unit 12 outputs each prediction block generated for the encoding target block, together with the corresponding prediction mode, motion vector, and the like, to the size mode selection unit 13.

The size mode selection unit 13 is an example of a size selection unit, and determines the prediction mode and the size of the block that serves as the unit of orthogonal transform applied to the encoding target block. For example, the orthogonal transform block size is selected from 16×16 pixels, 8×8 pixels, and 4×4 pixels. To this end, the size mode selection unit 13 calculates an evaluation value E, which is an estimate of the code amount of the encoding target block, for each orthogonal transform block size and for each prediction block generated by the prediction block generation unit 12. In doing so, the size mode selection unit 13 weights the evaluation value so that the lower the complexity, the smaller the evaluation value becomes for smaller orthogonal transform sizes, according to the following equation.

$$ E = D + \lambda \alpha R \tag{3} $$

Here, D is the prediction error image evaluation value, an evaluation value of the code amount of the prediction error image obtained by taking the difference between the prediction block and the encoding target block. R is the encoded item information amount, calculated according to the orthogonal transform size. α is a weighting factor determined according to the complexity, and λ is a constant.

To obtain the prediction error image evaluation value D for each orthogonal transform size, the size mode selection unit 13 divides the prediction error image into N sub-blocks according to the orthogonal transform size and applies the Hadamard transform to each sub-block. For example, N = 4 when the orthogonal transform size is 8×8 pixels, and N = 16 when it is 4×4 pixels. For each sub-block, the size mode selection unit 13 calculates the sum SATD_j (j = 1, 2, ..., N) of the absolute values of the coefficients C_ji (i = 1, 2, ..., M, where M is the number of pixels in the sub-block) obtained by the Hadamard transform. The size mode selection unit 13 then uses the total of the sums SATD_j over all sub-blocks as the prediction error image evaluation value D.
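The calculation of D can be sketched as follows: split the prediction error image into sub-blocks of the candidate transform size, apply a 2-D Hadamard transform to each, and add up the absolute values of all coefficients. This is a sketch under assumptions: the coefficients are left unscaled because the text does not specify a normalization (scaling conventions vary between encoders), and the function names are hypothetical.

```python
def hadamard_1d(values):
    """Unnormalized fast Walsh-Hadamard transform; length must be a power of two."""
    data = list(values)
    half = 1
    while half < len(data):
        for start in range(0, len(data), 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b  # butterfly step
        half *= 2
    return data


def satd(sub_block):
    """Sum of absolute 2-D Hadamard coefficients of one square sub-block."""
    rows = [hadamard_1d(row) for row in sub_block]        # transform each row
    columns = [hadamard_1d(col) for col in zip(*rows)]    # then each column
    return sum(abs(c) for col in columns for c in col)


def prediction_error_cost(residual, transform_size):
    """D: total SATD over all sub-blocks of the given transform size."""
    n = len(residual)
    total = 0
    for top in range(0, n, transform_size):
        for left in range(0, n, transform_size):
            sub = [row[left:left + transform_size]
                   for row in residual[top:top + transform_size]]
            total += satd(sub)
    return total


# 16x16 residual with a single non-zero pixel, evaluated for three sizes.
impulse = [[0] * 16 for _ in range(16)]
impulse[0][0] = 3
for size in (4, 8, 16):
    print(size, prediction_error_cost(impulse, size))
```

Because the sum of absolute values does not depend on coefficient order, the natural-order transform used here gives the same SATD as a sequency-ordered Hadamard transform.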

The size mode selection unit 13 also calculates, as the encoded item information amount R, the code amount of items such as the orthogonal transform size division flag, the quantization value applied to the encoding target block, and the coefficient presence flags. The orthogonal transform size division flag indicates the orthogonal transform size. A coefficient presence flag indicates whether any orthogonal transform coefficient with a non-zero absolute value exists; it is obtained separately for luminance and for color difference, and for each sub-block that is orthogonally transformed. Consequently, the smaller the orthogonal transform block size, the greater the number of coefficient presence flags, and the larger the encoded item information amount R.

The weighting factor α is set so that it becomes smaller as the complexity of the encoding target block becomes lower. As shown in equation (3), α is used in calculating the evaluation value E to weight the encoded item information amount R, which increases as the orthogonal transform block size decreases.

FIG. 3 is a diagram showing the relationship between the complexity of the encoding target block and the weighting factor. In FIG. 3, the horizontal axis represents the complexity (in this example, the activity per pixel), and the vertical axis represents the value of the weighting factor α. Graph 300 represents the relationship between the complexity and α. In this example, when the complexity is 0, that is, when all pixels in the encoding target block have the same value, α takes its minimum value of 0.2. As the complexity increases, α increases linearly, and for a complexity of 10 or more, α is constant at 1.
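The mapping in FIG. 3 is a clamped linear function, which can be sketched as follows. The endpoint values (0.2 at complexity 0, 1 at complexity 10 and above) come from the text; the function and parameter names are hypothetical.

```python
def weighting_factor(complexity, alpha_min=0.2, saturation=10.0):
    """Weighting factor alpha as a clamped linear function of the complexity.

    alpha rises linearly from alpha_min at complexity 0 to 1.0 at the
    saturation point, and stays at 1.0 beyond it.
    """
    if complexity >= saturation:
        return 1.0
    clamped = max(complexity, 0.0)  # guard against negative inputs
    return alpha_min + (1.0 - alpha_min) * clamped / saturation


for c in (0, 2.5, 5, 10, 20):
    print(c, weighting_factor(c))
```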

With the weighting factor α set in this way, the flatter the scene shown in the encoding target block, the smaller the value of the term containing the encoded item information amount R. Consequently, the flatter the scene, the more likely a small orthogonal transform size is to be selected.

The size mode selection unit 13 selects the orthogonal transform size and the prediction mode that correspond to the minimum evaluation value for use in encoding the encoding target block. The size mode selection unit 13 then outputs the prediction block generated according to the selected prediction mode, together with the orthogonal transform size, to the encoding unit 14.
The size mode selection unit 13 also outputs to the encoding unit 14 a parameter representing the selected prediction mode and the information used to generate the prediction block in that mode, for example, the motion vector.

符号化部１４は、サイズモード選択部１３により選択された予測モードに従って生成された予測ブロックと、直交変換サイズとに従って、符号化対象ブロックを符号化する。すなわち、符号化部１４は、符号化対象ブロックと予測ブロック間の予測誤差画像を選択された直交変換サイズを持つサブブロック単位で直交変換して得られる直交変換係数を符号化する。 The encoding unit 14 encodes the encoding target block according to the prediction block generated according to the prediction mode selected by the size mode selection unit 13 and the orthogonal transform size. That is, the encoding unit 14 encodes an orthogonal transform coefficient obtained by orthogonal transform of a prediction error image between a coding target block and a prediction block in units of sub blocks having a selected orthogonal transform size.

そこで、符号化部１４の予測誤差算出部１５は、符号化対象ブロックと予測ブロックとの差分演算を実行する。そして予測誤差算出部１５は、その差分演算により得られたブロック内の各画素に対応する差分値を、予測誤差画像とする。予測誤差算出部１５は、予測誤差画像と、直交変換サイズとを符号化部１４の直交変換部１６へ出力する。 Therefore, the prediction error calculation unit 15 of the encoding unit 14 performs a difference calculation between the encoding target block and the prediction block. The prediction error calculation unit 15 then uses the difference values obtained for the individual pixels in the block as the prediction error image. The prediction error calculation unit 15 outputs the prediction error image and the orthogonal transform size to the orthogonal transform unit 16 of the encoding unit 14.

直交変換部１６は、符号化対象ブロックの予測誤差画像を、選択された直交変換サイズのサブブロックに分割し、サブブロック単位で直交変換することにより、各サブブロックの直交変換係数を求める。例えば、直交変換部１６は、直交変換処理として、離散コサイン変換（Discrete Cosine Transform、DCT）を各サブブロックに対して実行することにより、直交変換係数として、サブブロックごとのDCT係数の組を得る。ただし、選択された直交変換サイズと符号化対象ブロックのサイズが同一である場合は、直交変換部１６は、符号化対象ブロック自体に対して直交変換処理を実行すればよい。 The orthogonal transform unit 16 divides the prediction error image of the encoding target block into sub-blocks of the selected orthogonal transform size and performs an orthogonal transform on each sub-block, thereby obtaining the orthogonal transform coefficients of each sub-block. For example, the orthogonal transform unit 16 performs a discrete cosine transform (DCT) on each sub-block as the orthogonal transform processing and obtains a set of DCT coefficients for each sub-block as the orthogonal transform coefficients. However, when the selected orthogonal transform size is the same as the size of the encoding target block, the orthogonal transform unit 16 may perform the orthogonal transform processing on the encoding target block itself.
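The sub-block transform described above can be sketched in plain Python as follows. This is a minimal floating-point illustration using an orthonormal DCT-II; practical codecs use integer approximations of the transform, and all function names here are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def dct2(block):
    """Separable 2-D DCT of a square block: C * X * C^T."""
    n = len(block)
    c = dct_matrix(n)
    tmp = [[sum(c[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][j] * c[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]

def transform_block(residual, size):
    """Split a square prediction error image into size x size sub-blocks
    and transform each one. Keys are the (x, y) origin of each sub-block."""
    n = len(residual)
    coeffs = {}
    for y in range(0, n, size):
        for x in range(0, n, size):
            sub = [row[x:x + size] for row in residual[y:y + size]]
            coeffs[(x, y)] = dct2(sub)
    return coeffs
```

When `size` equals the block size, the loop runs once and the whole block is transformed as a single unit, matching the special case noted in the text.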

直交変換部１６は、符号化対象ブロックの各サブブロックの直交変換係数を符号化部１４の量子化部１７へ出力する。 The orthogonal transform unit 16 outputs the orthogonal transform coefficient of each sub-block of the encoding target block to the quantization unit 17 of the encoding unit 14.

量子化部１７は、符号化対象ブロックの各直交変換係数を量子化することにより、その直交変換係数の量子化係数を算出する。この量子化処理は、一定区間に含まれる信号値を一つの信号値で表す処理である。そしてその一定区間は、量子化幅と呼ばれる。例えば、量子化部１７は、直交変換係数から、量子化幅に相当する所定数の下位ビットを切り捨てることにより、その直交変換係数を量子化する。量子化幅は、量子化パラメータによって決定される。例えば、量子化部１７は、量子化パラメータの値に対する量子化幅の値を表す関数にしたがって、使用される量子化幅を決定する。またその関数は、量子化パラメータの値に対する単調増加関数とすることができ、予め設定される。また量子化パラメータは、符号化対象ブロックを含む符号化対象ピクチャに割り当てられる符号量などに基づいて、例えば、制御部（図示せず）により決定され、量子化部１７に通知される。 The quantization unit 17 calculates the quantization coefficient of the orthogonal transform coefficient by quantizing each orthogonal transform coefficient of the encoding target block. This quantization process is a process that represents a signal value included in a certain section as one signal value. The fixed interval is called a quantization width. For example, the quantization unit 17 quantizes the orthogonal transform coefficient by truncating a predetermined number of lower bits corresponding to the quantization width from the orthogonal transform coefficient. The quantization width is determined by the quantization parameter. For example, the quantization unit 17 determines a quantization width to be used according to a function representing a quantization width value with respect to a quantization parameter value. The function can be a monotonically increasing function with respect to the value of the quantization parameter, and is set in advance. Also, the quantization parameter is determined by, for example, a control unit (not shown) based on the code amount allocated to the encoding target picture including the encoding target block, and is notified to the quantization unit 17.
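A minimal sketch of the bit-truncation quantization described above, together with the matching inverse step (multiplying by the quantization width), is shown below. The mapping from quantization parameter to the number of dropped bits is purely hypothetical; the text only requires it to be a preset, monotonically increasing function. Truncating the magnitude rather than the signed value is also an assumption.

```python
def quantization_shift(qp: int) -> int:
    """Hypothetical preset mapping: the number of dropped low-order bits
    never decreases as the quantization parameter grows."""
    return qp // 6

def quantize(coeff: int, shift: int) -> int:
    """Drop `shift` low-order bits of the magnitude, keeping the sign."""
    magnitude = abs(coeff) >> shift
    return -magnitude if coeff < 0 else magnitude

def dequantize(level: int, shift: int) -> int:
    """Multiply by the quantization width (2**shift) to restore the scale."""
    return level * (1 << shift)
```

For example, with a shift of 3 the coefficient 37 becomes level 4 and is restored as 32, so every value in the interval 32 to 39 is represented by the same level.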

量子化部１７は、符号化対象ブロックの量子化係数を、復号部１８及び可変長符号化部２０へ出力する。 The quantization unit 17 outputs the quantization coefficient of the block to be encoded to the decoding unit 18 and the variable length coding unit 20.

復号部１８は、符号化対象ブロックの量子化係数から、そのブロックよりも後のブロックを符号化するための参照ブロック及び参照ピクチャを生成する。そのために、復号部１８は、量子化係数に、量子化パラメータにより決定された量子化幅に相当する所定数を乗算することにより、量子化係数を逆量子化する。この逆量子化により、符号化対象ブロックの直交変換係数、例えば、DCT係数の組が復元される。その後、復号部１８は、直交変換係数を、適用された直交変換のサイズを持つサブブロックごとに逆直交変換処理する。例えば、直交変換部１６がDCTを用いて直交変換係数を算出している場合、復号部１８は、復元された直交変換係数に対して逆DCT処理を実行する。逆量子化処理及び逆直交変換処理を量子化信号に対して実行することにより、符号化前の予測誤差画像と同程度の情報を有する予測誤差画像が再生される。 The decoding unit 18 generates a reference block and a reference picture for encoding a block after the block from the quantization coefficient of the encoding target block. For this purpose, the decoding unit 18 inversely quantizes the quantization coefficient by multiplying the quantization coefficient by a predetermined number corresponding to the quantization width determined by the quantization parameter. By this inverse quantization, an orthogonal transform coefficient of the encoding target block, for example, a set of DCT coefficients is restored. Thereafter, the decoding unit 18 performs an inverse orthogonal transform process on the orthogonal transform coefficient for each sub-block having the size of the applied orthogonal transform. For example, when the orthogonal transform unit 16 calculates an orthogonal transform coefficient using DCT, the decoding unit 18 performs an inverse DCT process on the restored orthogonal transform coefficient. By performing the inverse quantization process and the inverse orthogonal transform process on the quantized signal, a prediction error image having the same level of information as the prediction error image before encoding is reproduced.

復号部１８は、予測ブロックの各画素値に、その画素に対応する再生された予測誤差信号を加算する。これらの処理を各ブロックについて実行することにより、復号部１８は、その後に符号化されるブロックに対する予測ブロックを生成するために利用される参照ブロックを生成する。また復号部１８は、参照ブロックのブロックノイズを軽減するために、参照ブロックに対してデブロッキングフィルタ処理を実行してもよい。 The decoding unit 18 adds, to each pixel value of the prediction block, the reproduced prediction error signal corresponding to that pixel. By executing these processes for each block, the decoding unit 18 generates a reference block that is used to generate prediction blocks for blocks encoded later. The decoding unit 18 may also apply deblocking filter processing to the reference block in order to reduce its block noise.
復号部１８は、参照ブロックを生成する度に、その参照ブロックを記憶部１９に記憶させる。 Each time the decoding unit 18 generates a reference block, it stores that reference block in the storage unit 19.
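The pixel-wise addition that produces a reference block can be sketched as below. Clipping the result to the 8-bit range is an assumption added for the illustration; the text does not state the pixel bit depth, and the function name is hypothetical.

```python
def reconstruct_block(prediction, residual, max_value=255):
    """Add the reproduced prediction error to the prediction, pixel by pixel,
    clipping to [0, max_value] (assumed 8-bit samples)."""
    return [[min(max(p + r, 0), max_value) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]
```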

記憶部１９は、復号部１８から受け取った参照ブロックを一時的に記憶する。なお、参照ピクチャは、各ブロックの符号化順序にしたがって、１枚のピクチャ分の参照ブロックを結合することで得られる。記憶部１９は、予測ブロック生成部１２及びサイズモード選択部１３に、参照ピクチャまたは参照ブロックを供給する。なお、記憶部１９は、符号化対象ピクチャが参照する可能性がある、予め定められた所定枚数分の参照ピクチャを記憶し、参照ピクチャの枚数がその所定枚数を超えると、符号化順序が古い参照ピクチャから順に破棄する。 The storage unit 19 temporarily stores the reference blocks received from the decoding unit 18. A reference picture is obtained by combining the reference blocks of one picture in accordance with the encoding order of the blocks. The storage unit 19 supplies the reference picture or the reference block to the prediction block generation unit 12 and the size mode selection unit 13. The storage unit 19 stores a predetermined number of reference pictures that may be referred to by the encoding target picture. When the number of reference pictures exceeds that predetermined number, the storage unit 19 discards reference pictures in order, starting from the oldest in encoding order.
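The discard policy of the storage unit 19 behaves like a fixed-capacity first-in, first-out buffer. A minimal sketch, with a hypothetical class name and a caller-chosen capacity, might look like this:

```python
from collections import deque

class ReferencePictureStore:
    """Keeps at most `capacity` reference pictures, oldest first."""

    def __init__(self, capacity: int):
        # deque(maxlen=...) drops the oldest entry automatically when full.
        self._pictures = deque(maxlen=capacity)

    def add(self, picture_id, picture):
        """Store a newly reconstructed reference picture."""
        self._pictures.append((picture_id, picture))

    def ids(self):
        """Identifiers of the stored pictures, oldest first."""
        return [pid for pid, _ in self._pictures]
```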

可変長符号化部２０は、量子化部１７から受け取った量子化係数を、生起確率が高い信号値ほど短くなるように可変長符号化する。また可変長符号化部２０は、動きベクトルなどの予測ブロックの生成に利用される情報も可変長符号化する。可変長符号化部２０は、例えば、可変長符号化処理として、CAVLCといったハフマン符号化処理あるいはCABACといった算術符号化処理を用いることができる。 The variable length coding unit 20 applies variable length coding to the quantized coefficients received from the quantization unit 17 so that a signal value with a higher occurrence probability receives a shorter code. The variable length coding unit 20 also applies variable length coding to the information used for generating the prediction block, such as a motion vector. As the variable length coding processing, the variable length coding unit 20 can use, for example, Huffman coding processing such as CAVLC or arithmetic coding processing such as CABAC.
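CAVLC and CABAC are too involved for a short example, so the sketch below substitutes Exp-Golomb coding, a much simpler variable length code, purely to illustrate the principle that values expected to occur more often (here, small magnitudes) receive shorter code words. It is not the coding used by the variable length coding unit 20, and the function names are hypothetical.

```python
def ue(value: int) -> str:
    """Unsigned Exp-Golomb code word for value >= 0, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    bits = bin(value + 1)[2:]             # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits   # prefixed by len(bits) - 1 zeros

def se(value: int) -> str:
    """Signed Exp-Golomb: maps 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ..."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue(mapped)
```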

可変長符号化部２０により生成された符号化信号に対して、動画像符号化装置１は、ブロックごとの予測モードなどを含む所定の情報をヘッダ情報として付加することにより、符号化された動画像データを含むデータストリームを生成する。動画像符号化装置１は、そのデータストリームを磁気記憶媒体、光記憶媒体あるいは半導体メモリなどを有する記憶部（図示せず）に記憶するか、あるいはそのデータストリームを他の機器へ出力する。 The moving image encoding apparatus 1 adds predetermined information, including the prediction mode of each block, as header information to the encoded signal generated by the variable length coding unit 20, thereby generating a data stream that contains the encoded moving image data. The moving image encoding apparatus 1 stores the data stream in a storage unit (not shown) having a magnetic storage medium, an optical storage medium, a semiconductor memory, or the like, or outputs the data stream to another device.

図４は、動画像符号化装置１により実行される、動画像符号化処理の動作フローチャートである。動画像符号化装置１は、符号化対象ブロックごとに図４に示される動画像符号化処理を実行する。 FIG. 4 is an operation flowchart of the moving image encoding process executed by the moving image encoding device 1. The moving image encoding device 1 executes the moving image encoding process shown in FIG. 4 for each encoding target block.

複雑度算出部１１は、符号化対象ブロックの複雑度を算出する（ステップＳ１０１）。そして複雑度算出部１１は、複雑度をサイズモード選択部１３へ出力する。また予測ブロック生成部１２は、符号化対象ブロックについて適用可能な予測モードごとに予測ブロックを生成する（ステップＳ１０２）。予測ブロック生成部１２は、予測ブロックをサイズモード選択部１３へ出力する。 The complexity calculation unit 11 calculates the complexity of the encoding target block (step S101). Then, the complexity calculation unit 11 outputs the complexity to the size mode selection unit 13. Further, the prediction block generation unit 12 generates a prediction block for each prediction mode applicable to the encoding target block (step S102). The prediction block generation unit 12 outputs the prediction blocks to the size mode selection unit 13.

サイズモード選択部１３は、複雑度が低いほど、符号化項目情報量Rに対する重み係数αが小さくなるように重み係数αを決定する（ステップＳ１０３）。そしてサイズモード選択部１３は、符号化対象ブロックについて適用可能な予測モード及び適用可能な直交変換のブロックサイズごとに、その重み係数αで符号化項目情報量Rを重み付けして評価値Eを算出する（ステップＳ１０４）。サイズモード選択部１３は、評価値Eが最小となる予測モード及び直交変換のブロックサイズを、符号化対象ブロックに対して適用する予測モード及び直交変換のブロックサイズとする（ステップＳ１０５）。そしてサイズモード選択部１３は、適用する予測モードで生成された予測ブロック及び直交変換のブロックサイズを符号化部１４へ出力する。 The size mode selection unit 13 determines the weighting factor α so that the weighting factor α with respect to the encoded item information amount R decreases as the complexity decreases (step S103). Then, the size mode selection unit 13 calculates the evaluation value E by weighting the encoding item information amount R by the weighting coefficient α for each prediction mode applicable to the encoding target block and applicable block size of orthogonal transform. (Step S104). The size mode selection unit 13 sets the prediction mode and orthogonal transform block size that minimize the evaluation value E as the prediction mode and orthogonal transform block size to be applied to the encoding target block (step S105). Then, the size mode selection unit 13 outputs the prediction block generated in the prediction mode to be applied and the block size of the orthogonal transform to the encoding unit 14.
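Steps S103 to S105 amount to a minimum search over the candidate combinations of prediction mode and transform size. Equation (3) is not reproduced in this part of the text, so the sketch below assumes it has the form E = SATD + αλR, which matches the description of α as a weight on the encoded item information amount R. The function names and the candidate tuple layout are hypothetical.

```python
def evaluation_value(satd: float, rate: float, alpha: float, lam: float) -> float:
    """E = SATD + alpha * lam * R (assumed form of equation (3))."""
    return satd + alpha * lam * rate

def select_mode_and_size(candidates, alpha, lam):
    """candidates: iterable of (mode, size, satd, rate) tuples.
    Returns the (mode, size) pair whose evaluation value E is smallest."""
    best = min(candidates, key=lambda c: evaluation_value(c[2], c[3], alpha, lam))
    return best[0], best[1]
```

With candidates `("inter", 16, 100.0, 20.0)` and `("inter", 4, 60.0, 80.0)` and λ = 1, the 16×16 transform wins at α = 1 (120 against 140), while the 4×4 transform wins at α = 0.2 (104 against 76). This is the shift toward small transform sizes for flat blocks that the text describes.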

符号化部１４の予測誤差算出部１５は、符号化対象ブロックと予測ブロック間で差分演算を行って、予測誤差画像を算出する（ステップＳ１０６）。そして符号化部１４の直交変換部１６は、予測誤差画像を、適用される直交変換サイズを持つサブブロック単位で分割し、サブブロックごとに直交変換して直交変換係数を算出する（ステップＳ１０７）。 The prediction error calculation unit 15 of the encoding unit 14 performs a difference calculation between the encoding target block and the prediction block to calculate a prediction error image (step S106). Then, the orthogonal transform unit 16 of the encoding unit 14 divides the prediction error image into sub-blocks of the orthogonal transform size to be applied and performs an orthogonal transform on each sub-block to calculate orthogonal transform coefficients (step S107).

符号化部１４の量子化部１７は、直交変換係数を量子化して量子化係数を求める（ステップＳ１０８）。そして量子化部１７は、量子化係数を符号化部１４の復号部１８及び可変長符号化部２０へ出力する。 The quantizing unit 17 of the encoding unit 14 obtains a quantized coefficient by quantizing the orthogonal transform coefficient (step S108). Then, the quantization unit 17 outputs the quantized coefficients to the decoding unit 18 and the variable length coding unit 20 of the coding unit 14.

復号部１８は、量子化係数を逆量子化して直交変換係数を再生し、その直交変換係数に対して、直交変換の際に適用されたブロックサイズのサブブロックごとに逆直交変換を適用して、予測誤差画像を再生する。そして復号部１８は、予測誤差画像の各画素の値を、予測ブロックの対応する画素の値に加算して、参照ブロックを求め、その参照ブロックを記憶部１９に記憶する（ステップＳ１０９）。 The decoding unit 18 inversely quantizes the quantized coefficients to reproduce the orthogonal transform coefficients, and applies an inverse orthogonal transform to those coefficients for each sub-block of the block size used in the orthogonal transform, thereby reproducing the prediction error image. The decoding unit 18 then adds the value of each pixel of the prediction error image to the value of the corresponding pixel of the prediction block to obtain a reference block, and stores the reference block in the storage unit 19 (step S109).

一方、可変長符号化部２０は、符号化対象ブロックに含まれる各量子化係数を可変長符号化する（ステップＳ１１０）。そして動画像符号化装置１は、符号化対象ブロックに対する動画像符号化処理を終了する。 On the other hand, the variable length coding unit 20 performs variable length coding on each quantization coefficient included in the coding target block (step S110). Then, the moving image encoding apparatus 1 ends the moving image encoding process for the encoding target block.

なお、動画像符号化装置１は、ステップＳ１０１の処理と、Ｓ１０２の処理の順序を入れ換えてもよく、あるいは、ステップＳ１０１の処理と、Ｓ１０２の処理を並列に実行してもよい。 Note that the moving image encoding apparatus 1 may exchange the order of the process of step S101 and the process of S102, or may execute the process of step S101 and the process of S102 in parallel.

以上に説明してきたように、この動画像符号化装置は、符号化対象ブロックの複雑度が低いほど、すなわち、符号化対象ブロックに写っているシーンが平坦であるほど、符号化項目情報量に対する重み係数を小さくする。そのため、符号化対象ブロックの複雑度が低いほど、直交変換の適用サイズとして小さなサイズが選ばれ易くなる。これにより、この動画像符号化装置は、平坦な背景とともに小さな移動物体が写っているようなシーンでも、復号画像においてその小さな移動物体が消失するような画質劣化を抑制できる、適切な直交変換サイズを選択できる。 As described above, this moving image encoding apparatus makes the weighting factor for the encoded item information amount smaller as the complexity of the encoding target block is lower, that is, as the scene shown in the encoding target block is flatter. Therefore, the lower the complexity of the encoding target block, the more readily a small size is selected as the orthogonal transform size to be applied. As a result, even for a scene in which a small moving object appears against a flat background, this moving image encoding apparatus can select an appropriate orthogonal transform size that suppresses image quality degradation in which the small moving object disappears from the decoded image.

変形例によれば、複雑度算出部１１は、符号化対象ブロックについて生成された予測ブロックから複雑度を算出してもよい。この場合において、予測ブロック生成部１２により複数の予測ブロックが生成されている場合、複雑度算出部１１は、各予測ブロックについて複雑度を算出し、その平均値を符号化対象ブロックについての複雑度としてもよい。 According to a modification, the complexity calculation unit 11 may calculate the complexity from the prediction block generated for the encoding target block. In this case, when the prediction block generation unit 12 has generated a plurality of prediction blocks, the complexity calculation unit 11 may calculate the complexity of each prediction block and use the average of those values as the complexity of the encoding target block.

あるいは、複雑度算出部１１は、各予測ブロックのうち、符号化対象ブロックと予測ブロック間の予測誤差画像の各画素の値の絶対値の総和が最小となる予測ブロックが、符号化対象ブロックに最も類似していると推定される。そこで複雑度算出部１１は、予測誤差画像の各画素の値の絶対値の総和が最小となる予測ブロックから複雑度を算出してもよい。 Alternatively, among the prediction blocks, the one that minimizes the sum of the absolute pixel values of the prediction error image between the encoding target block and the prediction block is presumed to be the most similar to the encoding target block. The complexity calculation unit 11 may therefore calculate the complexity from the prediction block that minimizes this sum of absolute values.
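This second variant can be sketched as a minimum search on the sum of absolute differences (SAD), followed by a complexity measurement on the winning prediction block. The definition of the activity per pixel is not given in this part of the text; the mean absolute deviation from the block mean used below is only an assumed stand-in, and all function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def activity_per_pixel(block):
    """Assumed complexity measure: mean absolute deviation from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)

def complexity_from_best_prediction(target, predictions):
    """Complexity of the prediction block with the smallest SAD to the target."""
    best = min(predictions, key=lambda pred: sad(target, pred))
    return activity_per_pixel(best)
```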

また他の変形例によれば、複雑度算出部１１は、符号化対象ブロックを直交変換の単位となるサブブロックに分割し、サブブロックごとに複雑度を算出してもよい。 According to another modification, the complexity calculation unit 11 may divide the encoding target block into sub-blocks that serve as units of orthogonal transform, and calculate the complexity for each sub-block.
この場合には、サイズモード選択部１３は、（３）式の代わりに、次式に従って評価値Eを算出することが好ましい。 In this case, it is preferable that the size mode selection unit 13 calculates the evaluation value E according to the following equation instead of equation (3).

ここで、SATD_jは、符号化対象ブロックの予測誤差画像をN個のサブブロックに分割したときのj番目のサブブロックについて算出される予測誤差評価値であり、アダマール変換係数C_jiの絶対値の総和として算出される。Rは符号化項目情報量であり、λは定数である。またβ_jは、j番目のサブブロックについて算出された複雑度に応じて決定される重み係数である。 Here, SATD_j is the prediction error evaluation value calculated for the j-th sub-block when the prediction error image of the encoding target block is divided into N sub-blocks, and is calculated as the sum of the absolute values of the Hadamard transform coefficients C_ji. R is the encoded item information amount, and λ is a constant. β_j is a weighting factor determined according to the complexity calculated for the j-th sub-block.
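The equation referred to above (equation (4), as it is called later in the text) did not survive in this text. Going only by the definitions just given, in which β_j weights the prediction error evaluation value of each sub-block, a form consistent with them would be the following. This is a reconstruction under that assumption, not a quotation of the original equation, and the index ranges are likewise assumed.

```latex
E = \sum_{j=1}^{N} \beta_j \,\mathrm{SATD}_j + \lambda R,
\qquad
\mathrm{SATD}_j = \sum_{i} \left| C_{ji} \right|
```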

この変形例では、重み係数β_jは、複雑度が低いほど大きくなるように設定される。 In this modification, the weighting factor β_j is set so that it becomes larger as the complexity is lower.
図５は、この変形例による、サブブロックの複雑度と重み係数の関係を示す図である。図５において、横軸は複雑度（この例では、１画素あたりのアクティビティ）を表し、縦軸は、重み係数β_jの値を表す。そしてグラフ５００は、複雑度と重み係数β_jの関係を表す。この例では、複雑度が０、すなわち、符号化対象ブロックに含まれる全ての画素が同一の値を持つ場合に重み係数β_jは最大の1となる。そして複雑度が増加するほど、重み係数β_jは線形に減少し、複雑度が１０以上では、重み係数β_jは0.2で一定となる。 FIG. 5 is a diagram showing the relationship between the complexity of a sub-block and the weighting factor according to this modification. In FIG. 5, the horizontal axis represents the complexity (in this example, the activity per pixel), and the vertical axis represents the value of the weighting factor β_j. The graph 500 represents the relationship between the complexity and the weighting factor β_j. In this example, when the complexity is 0, that is, when all the pixels included in the encoding target block have the same value, the weighting factor β_j takes its maximum value of 1. As the complexity increases, the weighting factor β_j decreases linearly, and when the complexity is 10 or more, the weighting factor β_j is constant at 0.2.
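The curve of FIG. 5 and the per-sub-block evaluation value can be sketched together. The weight function follows the example values in the text (1 at complexity 0, a linear fall, and 0.2 from complexity 10 upward). The evaluation function assumes that the equation for this modification has the form E = Σ_j β_j·SATD_j + λR, which is inferred from the symbol definitions rather than quoted, and all names are hypothetical.

```python
def beta_weight(complexity, beta_min=0.2, saturation=10.0):
    """Weighting factor beta_j; larger for flatter (less complex) sub-blocks."""
    if complexity <= 0.0:
        return 1.0
    if complexity >= saturation:
        return beta_min
    # Linear fall from 1.0 at complexity 0 to beta_min at the saturation point.
    return 1.0 - (1.0 - beta_min) * complexity / saturation

def evaluation_value_per_subblock(satds, complexities, rate, lam):
    """E = sum_j beta_j * SATD_j + lam * R (assumed form of the equation)."""
    weighted = sum(beta_weight(c) * s for s, c in zip(satds, complexities))
    return weighted + lam * rate
```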

サブブロックのサイズが小さいほど、サブブロック内の少しの画素値のばらつきで複雑度が高くなる。特に、符号化対象ブロックに写っているシーンが平坦な背景のなかを小さな物体が移動しているようなシーンである場合、その移動物体を表す領域を含むサブブロックが大きいほど、そのサブブロックの複雑度は低下する。したがって、上記のように重み係数β_jが設定されることで、移動物体を表す領域を含むサブブロックについての予測誤差評価値に対する重み係数が、そのサブブロックのサイズが小さくなるほど小さくなる。その結果として、上記のようなシーンが写っている符号化対象ブロックでは、直交変換の適用サイズとして小さなサイズが選択され易くなる。 The smaller the size of the sub-block, the higher the complexity with a slight variation in pixel values within the sub-block. In particular, when the scene shown in the encoding target block is a scene in which a small object is moving in a flat background, the larger the subblock including the area representing the moving object, the larger the subblock. Complexity decreases. Therefore, by setting the weighting factor β _j as described above, the weighting factor for the prediction error evaluation value for the subblock including the region representing the moving object becomes smaller as the size of the subblock becomes smaller. As a result, a small size is easily selected as an application size of the orthogonal transformation in the encoding target block in which the scene as described above is reflected.

さらに他の変形例によれば、サイズモード選択部１３は、HEVCに準拠するように、予測モード及び直交変換の適用サイズを決定してもよい。 According to yet another modification, the size mode selection unit 13 may determine the prediction mode and the orthogonal transform size to be applied so as to comply with HEVC.

図６は、HEVCによる、ピクチャの分割の一例を示す図である。図６に示されるように、ピクチャ６００は、符号化ブロックCoding Tree Unit(CTU)単位で分割され、各CTU６０１は、ラスタスキャン順に符号化される。CTU６０１のサイズは、64x64〜16x16画素の中から選択できる。 FIG. 6 is a diagram illustrating an example of picture division in HEVC. As shown in FIG. 6, a picture 600 is divided into Coding Tree Units (CTUs), which are the coding blocks, and the CTUs 601 are encoded in raster scan order. The size of the CTU 601 can be selected from 16x16 to 64x64 pixels.

CTU６０１は、さらに、四分木構造で複数のCoding Unit（CU）６０２に分割される。一つのCTU６０１内の各CU６０２は、Zスキャン順に符号化される。CU６０２のサイズは可変であり、そのサイズは、CU分割モード8x8〜64x64画素の中から選択される。CU６０２は、符号化モードであるイントラ予測符号化モードとインター予測符号化モードを選択する単位となる。CU６０２は、Prediction Unit（PU）６０３単位またはTransform Unit（TU）６０４単位で個別に処理される。PU６０３は、符号化モードに応じた予測が行われる単位となる。例えば、PU６０３は、イントラ予測符号化モードでは、イントラ予測モードが適用される単位となり、インター予測符号化モードでは、動き補償を行う単位となる。PU６０３のサイズは、例えば、インター予測符号化では、PU分割モードPartMode =2Nx2N, NxN, 2NxN, Nx2N, 2NxnU, 2NxnD, nRx2N, nLx2Nの中から選択できる。一方、TU６０４は、直交変換の単位であり、TU６０４のサイズは、4x4画素〜32x32画素の中から選択される。TU６０４は、四分木構造で分割され、Zスキャン順に処理される。 The CTU 601 is further divided into a plurality of Coding Units (CUs) 602 in a quadtree structure. The CUs 602 in one CTU 601 are encoded in Z-scan order. The size of the CU 602 is variable and is selected from the CU division modes of 8x8 to 64x64 pixels. The CU 602 is the unit for selecting between the two encoding modes, the intra prediction encoding mode and the inter prediction encoding mode. The CU 602 is processed individually in units of Prediction Units (PUs) 603 or Transform Units (TUs) 604. The PU 603 is the unit in which prediction according to the encoding mode is performed. For example, the PU 603 is the unit to which an intra prediction mode is applied in the intra prediction encoding mode, and the unit in which motion compensation is performed in the inter prediction encoding mode. In inter prediction encoding, for example, the size of the PU 603 can be selected from the PU partition modes PartMode = 2Nx2N, NxN, 2NxN, Nx2N, 2NxnU, 2NxnD, nRx2N, and nLx2N. The TU 604, on the other hand, is the unit of orthogonal transform, and the size of the TU 604 is selected from 4x4 to 32x32 pixels. The TU 604 is divided in a quadtree structure and processed in Z-scan order.
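The Z-scan order mentioned above visits the four quadrants of a block in the order top-left, top-right, bottom-left, bottom-right, and does so recursively. A minimal sketch, assuming a uniform split down to a fixed minimum size (a real quadtree may stop at different depths in different quadrants) and using a hypothetical function name:

```python
def z_scan(x, y, size, min_size):
    """Yield (x, y, size) for the min_size blocks inside a square region,
    in Z-scan order."""
    if size == min_size:
        yield (x, y, size)
        return
    half = size // 2
    # Quadrant order: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            yield from z_scan(x + dx, y + dy, half, min_size)
```

For a 16x16 region scanned in 4x4 blocks, the first four blocks fill the top-left 8x8 quadrant before the scan moves on to the block at (8, 0).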

この場合、複雑度算出部１１は、例えば、CUを符号化対象ブロックとして、複雑度を算出する。そして複雑度算出部１１は、CUについて算出した複雑度をサイズモード選択部１３へ出力する。 In this case, the complexity calculation unit 11 calculates the complexity, for example, with the CU as an encoding target block. Then, the complexity calculation unit 11 outputs the complexity calculated for the CU to the size mode selection unit 13.

サイズモード選択部１３は、符号化対象ブロックと、符号化対象ブロックをサブブロックに４分割したときのそれぞれについて評価値Eを算出する。例えば、符号化対象ブロックであるCUのサイズが32×32画素であるとする。この場合、サイズモード選択部１３は、CU自体を直交変換の適用単位とする場合と、CUを４個に等分割した16×16画素のサブブロックごとに直交変換する場合とで、それぞれ評価値Eを算出する。なお、適用可能な予測モードが複数存在する場合、サイズモード選択部１３は、各予測モードについても評価値Eを算出する。サイズモード選択部１３は、符号化対象ブロック自体を直交変換の適用単位とするときの評価値の方が、符号化対象ブロックを４分割したサブブロックを直交変換の適用単位とするときの評価値よりも小さい場合、符号化対象ブロック自体を直交変換の適用単位とする。一方、符号化対象ブロック自体を直交変換の適用単位とするときよりも、符号化対象ブロックを４分割したサブブロックを直交変換の適用単位とするときの評価値の方が小さい場合、サイズモード選択部１３は、個々のサブブロックを符号化対象ブロックとする。そしてサイズモード選択部１３は、個々の符号化対象ブロックについて、評価値の算出及び適用する直交変換サイズの選択を行う。その際、複雑度算出部１１も、新たに設定された符号化対象ブロックについて複雑度を算出する。 The size mode selection unit 13 calculates the evaluation value E both for the encoding target block itself and for the case where the encoding target block is divided into four sub-blocks. For example, assume that the CU serving as the encoding target block is 32×32 pixels. In this case, the size mode selection unit 13 calculates the evaluation value E for the case where the CU itself is the unit of orthogonal transform and for the case where the orthogonal transform is applied to each of the 16×16 pixel sub-blocks obtained by dividing the CU equally into four. When a plurality of prediction modes are applicable, the size mode selection unit 13 also calculates the evaluation value E for each prediction mode. When the evaluation value obtained with the encoding target block itself as the unit of orthogonal transform is smaller than the evaluation value obtained with the four sub-blocks as the units of orthogonal transform, the size mode selection unit 13 uses the encoding target block itself as the unit of orthogonal transform. Conversely, when the evaluation value obtained with the four sub-blocks as the units of orthogonal transform is smaller than the evaluation value obtained with the encoding target block itself as the unit, the size mode selection unit 13 treats each sub-block as a new encoding target block. The size mode selection unit 13 then calculates the evaluation values and selects the orthogonal transform size to be applied for each of these encoding target blocks. At that time, the complexity calculation unit 11 also calculates the complexity of each newly set encoding target block.

サイズモード選択部１３は、符号化対象ブロック自体が直交変換の適用単位となるか、サブブロックのサイズがTUサイズの最小設定値である4×4画素となるまで、上記の処理を繰り返す。 The size mode selection unit 13 repeats the above processing until the encoding target block itself becomes an application unit of orthogonal transformation or the size of the sub-block becomes 4 × 4 pixels which is the minimum setting value of the TU size.
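The repeated split decision can be written as a short recursion. In this sketch the caller supplies `cost(x, y, size)`, a hypothetical callback returning the evaluation value E for using the block at (x, y) as a single TU. Two further assumptions go beyond the text: the evaluation value of the four-way split is taken as the sum of the four sub-block values, and a tie keeps the larger block.

```python
def choose_tus(x, y, size, cost, min_size=4):
    """Return the list of (x, y, size) transform units chosen for a square block."""
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    children = [(x + dx, y + dy, half) for dy in (0, half) for dx in (0, half)]
    whole = cost(x, y, size)
    split = sum(cost(cx, cy, half) for cx, cy, _ in children)
    if whole <= split:
        # The block itself becomes the unit of orthogonal transform.
        return [(x, y, size)]
    # Otherwise each sub-block becomes a new target block and is examined in turn.
    tus = []
    for cx, cy, _ in children:
        tus.extend(choose_tus(cx, cy, half, cost, min_size))
    return tus
```

With a cost that is high only for blocks containing one small detail, the recursion keeps large TUs over the flat area and splits down to 4×4 only around the detail.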

図７は、HEVCに準拠して直交変換の適用サイズを決定する際の処理の一例を示す図である。なお、この例では、サイズモード選択部１３は、（３）式に従って評価値Eを算出する。しかし、複雑度算出部１１が、４分割するか否かの判断の対象となるブロックまたはサブブロックを４分割したサブブロックのそれぞれについて複雑度を算出することで、サイズモード選択部１３は、（４）式に従って評価値Eを算出してもよい。 FIG. 7 is a diagram illustrating an example of the processing for determining the orthogonal transform size to be applied in compliance with HEVC. In this example, the size mode selection unit 13 calculates the evaluation value E according to equation (3). However, the size mode selection unit 13 may instead calculate the evaluation value E according to equation (4), with the complexity calculation unit 11 calculating the complexity of each of the four sub-blocks obtained by dividing the block or sub-block for which the four-way division is being decided.

最初に、複雑度算出部１１及びサイズモード選択部１３は、CU７００を符号化対象ブロックとして、CU７００自体をTUとする場合と、CU７００を４分割した16×16画素のサブブロックをTUとする場合とで、それぞれ評価値Eを算出する。そしてCU７００自体をTUとするときの評価値の方が、CU７００を４分割した16×16画素のサブブロックをTUとする場合の評価値よりも小さい場合、TU７１０は、CU７００と同サイズとなる。 First, with the CU 700 as the encoding target block, the complexity calculation unit 11 and the size mode selection unit 13 calculate the evaluation value E for the case where the CU 700 itself is the TU and for the case where the 16×16 pixel sub-blocks obtained by dividing the CU 700 into four are the TUs. If the evaluation value for using the CU 700 itself as the TU is smaller than the evaluation value for using the 16×16 pixel sub-blocks as the TUs, the TU 710 has the same size as the CU 700.

一方、CU７００自体をTUとするときの評価値よりもCU７００を４分割した16×16画素のサブブロックをTUとする場合の評価値の方が小さい場合、複雑度算出部１１は、16×16画素のサブブロック７０１ごとに、複雑度を算出する。さらに、サイズモード選択部１３は、各サブブロック７０１をTUとするときの評価値と、各サブブロックをそれぞれ４分割した8×8画素のサブブロックをTUとするときの評価値を算出する。 On the other hand, when the evaluation value for using the 16×16 pixel sub-blocks obtained by dividing the CU 700 into four as the TUs is smaller than the evaluation value for using the CU 700 itself as the TU, the complexity calculation unit 11 calculates the complexity of each 16×16 pixel sub-block 701. Further, the size mode selection unit 13 calculates the evaluation value for using each sub-block 701 as a TU and the evaluation value for using the 8×8 pixel sub-blocks, obtained by dividing each sub-block 701 into four, as the TUs.

各サブブロック７０１をTUとするときの評価値の方が、各サブブロック７０１をそれぞれ４分割した8×8画素のサブブロックをTUとするときの評価値よりも小さい場合、サイズモード選択部１３は、各サブブロック７０１をTU７１１とする。 When the evaluation value for using each sub-block 701 as a TU is smaller than the evaluation value for using the 8×8 pixel sub-blocks, obtained by dividing each sub-block 701 into four, as the TUs, the size mode selection unit 13 sets each sub-block 701 as a TU 711.
一方、何れかのサブブロック７０１、例えば、右上のサブブロック７０１ａについて、そのサブブロック７０１ａをTUとするときの評価値よりも、そのサブブロック７０１ａを４分割したサブブロックをTUとするときの評価値の方が小さいとする。この場合、複雑度算出部１１は、サブブロック７０１ａを４分割した8×8画素のサブブロック７０２ごとに、複雑度を算出する。さらに、サイズモード選択部１３は、各サブブロック７０２をTUとするときの評価値と、各サブブロック７０２をそれぞれ４分割した4×4画素のサブブロックをTUとするときの評価値を算出する。 On the other hand, suppose that for one of the sub-blocks 701, for example, the upper-right sub-block 701a, the evaluation value for using the sub-blocks obtained by dividing the sub-block 701a into four as the TUs is smaller than the evaluation value for using the sub-block 701a itself as the TU. In this case, the complexity calculation unit 11 calculates the complexity of each 8×8 pixel sub-block 702 obtained by dividing the sub-block 701a into four. Further, the size mode selection unit 13 calculates the evaluation value for using each sub-block 702 as a TU and the evaluation value for using the 4×4 pixel sub-blocks, obtained by dividing each sub-block 702 into four, as the TUs.

各サブブロック７０２をTUとするときの評価値が、各サブブロック７０２をそれぞれ４分割した4×4画素のサブブロックをTUとするときの評価値よりも小さい場合、サイズモード選択部１３は、サブブロック７０１ａ内の各サブブロック７０２をTU７１２とする。また、サブブロック７０１ａ以外のサブブロック７０１については、そのサブブロック７０１をTU７１１とする。 When the evaluation value for using each sub-block 702 as a TU is smaller than the evaluation value for using the 4×4 pixel sub-blocks, obtained by dividing each sub-block 702 into four, as the TUs, the size mode selection unit 13 sets each sub-block 702 in the sub-block 701a as a TU 712. Each sub-block 701 other than the sub-block 701a is set as a TU 711.
一方、何れかのサブブロック７０２、例えば、左下のサブブロック７０２ａについて、そのサブブロック７０２ａをTUとするときの評価値よりも、そのサブブロック７０２ａを４分割したサブブロックをTUとするときの評価値の方が小さいとする。この場合、サイズモード選択部１３は、サブブロック７０２ａ内の各サブブロックをTU７１３とする。また、サブブロック７０２ａ以外のサブブロック７０２については、そのサブブロック７０２をTU７１２とする。 On the other hand, suppose that for one of the sub-blocks 702, for example, the lower-left sub-block 702a, the evaluation value for using the sub-blocks obtained by dividing the sub-block 702a into four as the TUs is smaller than the evaluation value for using the sub-block 702a itself as the TU. In this case, the size mode selection unit 13 sets each sub-block in the sub-block 702a as a TU 713. Each sub-block 702 other than the sub-block 702a is set as a TU 712.

この変形例によれば、動画像符号化装置は、上記の実施形態と同様に、画質劣化を抑制できる、適切な直交変換サイズを選択できるとともに、HEVCに準拠して、直交変換サイズを選択できる。 According to this modification, the moving image encoding apparatus can, as in the above embodiment, select an appropriate orthogonal transform size that suppresses image quality degradation, and can make that selection in compliance with HEVC.

図８は、上記の実施形態またはその変形例による動画像符号化装置の各部の機能を実現するコンピュータプログラムが動作することにより、動画像符号化装置として動作するコンピュータの構成図である。 FIG. 8 is a configuration diagram of a computer that operates as a moving image encoding apparatus when a computer program that realizes the functions of the respective units of the moving image encoding apparatus according to the above-described embodiment or its modification is operated.

コンピュータ１００は、ユーザインターフェース部１０１と、通信インターフェース部１０２と、記憶部１０３と、記憶媒体アクセス装置１０４と、プロセッサ１０５とを有する。プロセッサ１０５は、ユーザインターフェース部１０１、通信インターフェース部１０２、記憶部１０３及び記憶媒体アクセス装置１０４と、例えば、バスを介して接続される。 The computer 100 includes a user interface unit 101, a communication interface unit 102, a storage unit 103, a storage medium access device 104, and a processor 105. The processor 105 is connected to the user interface unit 101, the communication interface unit 102, the storage unit 103, and the storage medium access device 104 via, for example, a bus.

ユーザインターフェース部１０１は、例えば、キーボードとマウスなどの入力装置と、液晶ディスプレイといった表示装置とを有する。または、ユーザインターフェース部１０１は、タッチパネルディスプレイといった、入力装置と表示装置とが一体化された装置を有してもよい。そしてユーザインターフェース部１０１は、例えば、ユーザの操作に応じて、符号化する動画像データを選択する操作信号をプロセッサ１０５へ出力する。 The user interface unit 101 includes, for example, an input device such as a keyboard and a mouse, and a display device such as a liquid crystal display. Alternatively, the user interface unit 101 may include a device such as a touch panel display in which an input device and a display device are integrated. Then, the user interface unit 101 outputs, for example, an operation signal for selecting moving image data to be encoded to the processor 105 in accordance with a user operation.

通信インターフェース部１０２は、コンピュータ１００を、動画像データを生成する装置、例えば、ビデオカメラと接続するための通信インターフェース及びその制御回路を有してもよい。そのような通信インターフェースは、例えば、Universal Serial Bus（ユニバーサル・シリアル・バス、USB）とすることができる。 The communication interface unit 102 may include a communication interface for connecting the computer 100 to a device that generates moving image data, for example, a video camera, and a control circuit for that interface. Such a communication interface can be, for example, Universal Serial Bus (USB).

Furthermore, the communication interface unit 102 may include a communication interface, and its control circuit, for connecting to a communication network that conforms to a communication standard such as Ethernet (registered trademark).

In this case, the communication interface unit 102 acquires the moving image data to be encoded from another device connected to the communication network and passes that data to the processor 105. The communication interface unit 102 may also output the encoded moving image data received from the processor 105 to another device via the communication network.

The storage unit 103 includes, for example, a readable and writable semiconductor memory and a read-only semiconductor memory. The storage unit 103 stores the computer program that is executed on the processor 105 to perform the moving image encoding process, as well as data generated during or as a result of that process.

The storage medium access device 104 is a device that accesses a storage medium 106 such as a magnetic disk, a semiconductor memory card, or an optical storage medium. The storage medium access device 104 reads, for example, the computer program for the moving image encoding process that is stored on the storage medium 106 and is to be executed on the processor 105, and passes it to the processor 105.

The processor 105 generates encoded moving image data by executing the computer program for the moving image encoding process according to the above embodiment or its modifications. The processor 105 then stores the generated encoded moving image data in the storage unit 103 or outputs it to another device via the communication interface unit 102.

Note that a computer program that implements the functions of the units of the moving image encoding apparatus 1 on a processor may be provided in a form recorded on a computer-readable medium. However, such a recording medium does not include a carrier wave.

All examples and specific terms recited herein are intended for instructional purposes, to help the reader understand the present invention and the concepts contributed by the inventor to furthering the art. They are to be construed as not being limited to such specifically recited examples and conditions, nor does the organization of any example in this specification relate to a showing of the superiority or inferiority of the present invention. Although embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and modifications can be made to them without departing from the spirit and scope of the present invention.

Description of Symbols
1 Moving image encoding apparatus
10 Dividing unit
11 Complexity calculation unit
12 Prediction block generation unit
13 Size mode selection unit
14 Encoding unit
15 Prediction error calculation unit
16 Orthogonal transform unit
17 Quantization unit
18 Decoding unit
19 Storage unit
20 Variable length encoding unit
100 Computer
101 User interface unit
102 Communication interface unit
103 Storage unit
104 Storage medium access device
105 Processor
106 Storage medium

Claims

A moving image encoding apparatus for encoding a picture included in moving image data, the apparatus comprising:
a dividing unit that divides the picture into a plurality of blocks;
a complexity calculation unit that calculates a complexity representing the complexity of the scene shown in an encoding target block among the plurality of blocks;
a prediction block generation unit that generates a prediction block of the encoding target block from an encoding target picture including the encoding target block, or from a picture encoded before the encoding target picture;
a size selection unit that, for each of a plurality of mutually different orthogonal transform sizes, calculates an evaluation value representing the code amount when the encoding target block is orthogonally transformed in units of sub-blocks having that orthogonal transform size, weighting the evaluation value so that, as the complexity decreases, the evaluation value becomes smaller for smaller orthogonal transform sizes, and that selects the orthogonal transform size that minimizes the evaluation value; and
an encoding unit that encodes orthogonal transform coefficients obtained by orthogonally transforming a prediction error image between the encoding target block and the prediction block in units of sub-blocks having the selected orthogonal transform size.

The moving image encoding apparatus according to claim 1, wherein the evaluation value includes a first term that increases as the orthogonal transform size decreases, and
the size selection unit calculates the evaluation value by weighting the first term with a weighting factor that decreases as the complexity decreases.
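
The selection described in claims 1 and 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the formula used in the embodiment: the names `overhead_term`, `weight`, and `select_transform_size`, the linear weight, and the code amounts in `coeff_bits` are all hypothetical.

```python
# Hypothetical sketch: every name, formula and number here is an assumption.

def overhead_term(block_size: int, transform_size: int) -> int:
    """First term: grows as the transform size shrinks (one unit per sub-block)."""
    per_side = block_size // transform_size
    return per_side * per_side


def weight(complexity: float, scale: float = 0.1) -> float:
    """Weighting factor that decreases as the complexity decreases (assumed linear)."""
    return scale * complexity


def select_transform_size(block_size, sizes, coeff_bits, complexity):
    """Return the candidate transform size with the smallest evaluation value.

    coeff_bits maps each candidate size to an assumed coefficient code amount.
    """
    def evaluation(size):
        return coeff_bits[size] + weight(complexity) * overhead_term(block_size, size)

    return min(sizes, key=evaluation)


coeff_bits = {32: 120.0, 16: 110.0, 8: 100.0}  # made-up code amounts
flat = select_transform_size(32, [32, 16, 8], coeff_bits, complexity=1.0)
busy = select_transform_size(32, [32, 16, 8], coeff_bits, complexity=100.0)
```

With these made-up numbers, the low-complexity block ends up with the smallest transform size because the overhead term is barely weighted, while the high-complexity block keeps the largest one.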

The moving image encoding apparatus according to claim 2, wherein the first term includes information defined for each of the plurality of sub-blocks obtained when the encoding target block is divided into sub-blocks having the orthogonal transform size.

The moving image encoding apparatus according to claim 3, wherein the information defined for each of the plurality of sub-blocks includes a flag indicating whether or not the sub-block has an orthogonal transform coefficient whose absolute value is not 0.

The moving image encoding apparatus, wherein the evaluation value includes a second term representing a code amount for each sub-block when the encoding target block is divided into sub-blocks having the orthogonal transform size,
the complexity calculation unit calculates the complexity for each sub-block when the encoding target block is divided into sub-blocks having the orthogonal transform size, and
the size selection unit calculates the evaluation value by weighting the second term, for each sub-block, with a weighting factor that increases as the complexity calculated for that sub-block decreases.
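
As a rough illustration of the per-sub-block weighting in this claim, the second term could be accumulated as below. The function names and the particular weight `1 + 1 / (1 + complexity)` are assumptions chosen only because the weight grows as the complexity shrinks; the source does not give this formula here.

```python
# Hypothetical sketch: the weight function and all names are assumptions.

def subblock_weight(complexity: float) -> float:
    """Assumed weighting factor: larger when the sub-block complexity is smaller."""
    return 1.0 + 1.0 / (1.0 + complexity)


def weighted_code_amount(subblock_bits, subblock_complexities):
    """Second term: per-sub-block code amounts, each scaled by its own weight."""
    return sum(
        subblock_weight(c) * bits
        for bits, c in zip(subblock_bits, subblock_complexities)
    )


flat_cost = weighted_code_amount([10.0, 10.0], [0.0, 0.0])
textured_cost = weighted_code_amount([10.0, 10.0], [1.0, 3.0])
```

For the same raw code amounts, the flat sub-blocks are charged more than the textured ones under this assumed weight.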

The moving image encoding apparatus according to claim 1, wherein the size selection unit calculates a first evaluation value, which is the evaluation value when the size of the encoding target block is used as the orthogonal transform size, and a second evaluation value, which is the evaluation value when the size of the first sub-blocks obtained by dividing the encoding target block equally into four is used as the orthogonal transform size, and selects the size of the encoding target block as the orthogonal transform size when the first evaluation value is smaller than the second evaluation value, and
on the other hand, when the second evaluation value is smaller than the first evaluation value, the size selection unit causes the complexity calculation unit to calculate the complexity with each of the four first sub-blocks as the encoding target block, calculates the first evaluation value and the second evaluation value with each of the four first sub-blocks as the encoding target block, and selects the orthogonal transform size according to the result of comparing the first evaluation value with the second evaluation value.
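
The recursive comparison in this claim resembles a quadtree descent, which can be sketched as below. The callback `evaluate(x, y, block_size, transform_size)`, the stopping size `min_size`, the handling of ties (which the claim does not specify), and the toy cost function are all hypothetical.

```python
# Hypothetical sketch: evaluate(), min_size and the tie rule are assumptions.

def choose_sizes(x, y, size, min_size, evaluate):
    """Return a list of (x, y, transform_size) leaves for the block at (x, y)."""
    if size > min_size:
        first = evaluate(x, y, size, size)        # whole block as one transform
        second = evaluate(x, y, size, size // 2)  # four equal first sub-blocks
        if second < first:
            half = size // 2
            leaves = []
            # Each first sub-block becomes the encoding target block in turn.
            for dy in (0, half):
                for dx in (0, half):
                    leaves += choose_sizes(x + dx, y + dy, half, min_size, evaluate)
            return leaves
    # first < second, a tie (assumed), or the minimum size was reached.
    return [(x, y, size)]


def toy_evaluate(x, y, block_size, transform_size):
    """Made-up cost: splitting only pays off for blocks anchored at (0, 0)."""
    if transform_size < block_size:
        return 1.0 if (x, y) == (0, 0) else 3.0
    return 2.0


leaves = choose_sizes(0, 0, 32, 8, toy_evaluate)
```

With the toy cost, the 32x32 block splits once, its top-left 16x16 sub-block splits again into four 8x8 transforms, and the other three sub-blocks keep the 16x16 size.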

A moving image encoding method for encoding a picture included in moving image data, the method comprising:
dividing the picture into a plurality of blocks;
calculating a complexity representing the complexity of the scene shown in an encoding target block among the plurality of blocks;
generating a prediction block of the encoding target block from an encoding target picture including the encoding target block, or from a picture encoded before the encoding target picture;
for each of a plurality of mutually different orthogonal transform sizes, calculating an evaluation value representing the code amount when the encoding target block is orthogonally transformed in units of sub-blocks having that orthogonal transform size, weighting the evaluation value so that, as the complexity decreases, the evaluation value becomes smaller for smaller orthogonal transform sizes, and selecting the orthogonal transform size that minimizes the evaluation value; and
encoding orthogonal transform coefficients obtained by orthogonally transforming a prediction error image between the encoding target block and the prediction block in units of sub-blocks having the selected orthogonal transform size.

A moving image encoding computer program for encoding a picture included in moving image data, the program causing a computer to execute:
dividing the picture into a plurality of blocks;
calculating a complexity representing the complexity of the scene shown in an encoding target block among the plurality of blocks;
generating a prediction block of the encoding target block from an encoding target picture including the encoding target block, or from a picture encoded before the encoding target picture;
for each of a plurality of mutually different orthogonal transform sizes, calculating an evaluation value representing the code amount when the encoding target block is orthogonally transformed in units of sub-blocks having that orthogonal transform size, weighting the evaluation value so that, as the complexity decreases, the evaluation value becomes smaller for smaller orthogonal transform sizes, and selecting the orthogonal transform size that minimizes the evaluation value; and
encoding orthogonal transform coefficients obtained by orthogonally transforming a prediction error image between the encoding target block and the prediction block in units of sub-blocks having the selected orthogonal transform size.