JP2012175548A

JP2012175548A - Moving image encoding apparatus and moving image encoding method

Info

Publication number: JP2012175548A
Application number: JP2011037341A
Authority: JP
Inventors: Masashi Takahashi; 昌史高橋; Muneaki Yamaguchi; 宗明山口; Nobuhiro Chihara; 信博知原
Original assignee: Hitachi Kokusai Electric Inc
Current assignee: Kokusai Denki Electric Inc
Priority date: 2011-02-23
Filing date: 2011-02-23
Publication date: 2012-09-10
Also published as: WO2012114561A1

Abstract

Problem to be solved: To provide a moving picture encoding apparatus and a moving picture encoding method capable of processing high-quality images while keeping the code amount low with a small amount of computation.
Solution: In a moving picture encoding apparatus that receives an original image 301 and encodes the difference between a predicted encoding target image and the original image to obtain an encoded stream 311, there are provided an intra-screen prediction unit 305 that predicts the encoding target image by referring to the same image as the encoding target image, an inter-screen prediction unit 306 that predicts the encoding target image by referring to an image different from the encoding target image, and a mode selection unit 307 that determines which of the prediction results of the intra-screen prediction unit 305 and the inter-screen prediction unit 306 is used when encoding the difference. The mode selection unit 307 makes this determination based on a parameter value obtained during inter-screen prediction.
Selected drawing: FIG. 3

Description

The present invention relates to a moving picture encoding technique for encoding moving pictures, and more particularly to a moving picture encoding apparatus and a moving picture encoding method that switch between intra prediction and inter prediction.

As techniques for recording and transmitting large volumes of moving picture information as digital data, encoding schemes such as MPEG (Moving Picture Experts Group) have been established, and these have become internationally standardized encoding schemes such as MPEG-1, MPEG-2, MPEG-4, and H.264/AVC (Advanced Video Coding).
These encoding schemes have been adopted for video content in digital satellite broadcasting, DVDs, Blu-ray recorders, mobile phones, digital cameras, terrestrial digital broadcasting, and the like, and their range of use continues to expand, making them increasingly familiar.

In these standards, image information for which encoding has been completed (a decoded image) is used to predict the encoding target image in units of blocks each consisting of a plurality of pixels, and the difference from the original image (the prediction difference) is encoded. This removes the redundancy of the moving picture and reduces the code amount. There are two methods for block-wise prediction of the encoding target image: intra-screen prediction (hereinafter, intra prediction), which refers to surrounding regions within the same picture, and inter-screen prediction (hereinafter, inter prediction), which refers to a picture different from the target picture. The two can be switched according to the properties of the image.

The prediction means is switched, for example, in units of macroblocks of 16×16 pixels in H.264/AVC and MPEG-4, but the standards do not specify how the switching is to be done. In most cases, the candidates are evaluated with a cost function based on the sum of absolute differences (SAD) between the pixel values of the original image and the predicted image.
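As an illustration of the quantity on which such a cost function is built, the following is a minimal sketch of an SAD calculation. The function name `sad` and the representation of pixel blocks as nested lists are assumptions made for this example; they do not come from the specification.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized 2-D pixel blocks."""
    return sum(
        abs(o - p)
        for row_o, row_p in zip(original, predicted)
        for o, p in zip(row_o, row_p)
    )


# Tiny 2x2 example (real macroblocks are 16x16).
orig_block = [[10, 12], [14, 16]]
pred_block = [[11, 12], [13, 18]]
print(sad(orig_block, pred_block))  # 1 + 0 + 1 + 2 = 4
```

A smaller SAD means the predicted block is closer to the original, so less prediction difference remains to be encoded.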

However, in an encoder having a pipeline structure that processes macroblocks in parallel, for example, encoding of the neighboring macroblocks has often not yet been completed at the time the prediction means for the target macroblock is determined. In that case, the intra-predicted image, which relies on the decoded images of the neighboring blocks, cannot be generated, and therefore the SAD cannot be calculated.

A known prior-art approach therefore determines the prediction means by performing intra prediction with the "original image" of the neighboring blocks instead of their "decoded image" (so-called pseudo intra prediction), and uses the resulting SAD (referred to as pseudo SAD) for the cost calculation (see, for example, Patent Documents 1, 2, and 3).

Patent Document 1: JP 2007-288473 A
Patent Document 2: JP 2008-252346 A
Patent Document 3: JP 2009-81830 A

The above prior art does not take into account that the pseudo SAD value may deviate greatly from the SAD value, and therefore has the problem that the prediction means is selected incorrectly and the encoding efficiency drops significantly.
In the prior art, particularly in low bit rate bands, quantization error causes the SAD and pseudo SAD values to diverge. As a result, errors occur in the determination needed to select the prediction means, the so-called intra/inter determination, and the encoding efficiency drops significantly.

More specifically, when a rate control function is used that controls the quantization parameter so that the encoded stream has a desired bit rate, the value of the quantization parameter for the target macroblock has often not yet been determined at the time the intra/inter determination is made.
On the other hand, because the optimum prediction means often differs depending on the bit rate, it is customary to include a quantization parameter term in the cost function used for the intra/inter determination.

Consequently, a provisional value has to be set for the quantization parameter when the cost is calculated. If this provisional value differs greatly from the actual value, a proper determination result cannot be obtained and the encoding efficiency drops.
Because the provisional value is, as its name implies, only provisional, it can only be guessed, and setting it correctly is a matter of luck. As a result, the prior art suffers from reduced encoding efficiency.

Meanwhile, some of the standards for the above encoding schemes perform prediction by dividing a macroblock into still smaller blocks. In this case, however, the division patterns are numerous, and determining the optimum combination of prediction means and division pattern (the encoding mode) requires an enormous amount of computation, so this does not solve the problem.

Also, in a video encoder that processes macroblocks in parallel, encoding of the adjacent blocks is not complete when the encoding mode of the target block is selected, so the intra-predicted image cannot be generated and a highly accurate intra/inter determination cannot be expected.

Furthermore, in the prior art that controls the quantization parameter by rate control, the determination is made with a provisional quantization parameter, and the quality of the intra/inter determination depends heavily on the quantization parameter.
Consequently, if the provisional quantization parameter differs from the actual quantization parameter, the determination result is erroneous and the image quality deteriorates.

In newer standards such as H.264/AVC, the block size can be changed according to the picture content.
On the other hand, this increases the number of candidate encoding modes, so an even larger amount of computation is required to determine the optimum mode.
An object of the present invention is to provide a moving picture encoding apparatus and a moving picture encoding method capable of processing high-quality images while keeping the code amount low with a small amount of computation.

The above object is achieved by a moving picture encoding apparatus that encodes the difference between a predicted encoding target image and an original image, the apparatus comprising: inter-screen prediction means for predicting the encoding target image by referring to an image different from the encoding target image; intra-screen prediction means for predicting the encoding target image by referring to the same image as the encoding target image; and mode selection means for determining which of the prediction results of the inter-screen prediction means and the intra-screen prediction means is used when encoding the difference, wherein the mode selection means makes the determination based on a parameter value obtained during inter-screen prediction.
The above object is likewise achieved by a moving picture encoding method that encodes the difference between a predicted encoding target image and an original image, wherein, when the difference is encoded, the choice between predicting the encoding target image by referring to an image different from the encoding target image and predicting it by referring to the same image as the encoding target image is made based on a threshold set for a parameter value obtained during inter-screen prediction.

The thresholds introduced by the present invention depend very little on picture content or bit rate and can be set as constants without any problem. An objectively optimal intra/inter determination can therefore be made even when the quantization parameter has not yet been determined, such as when rate control is used. In addition, performing the intra/inter determination before the block size is determined makes it possible to narrow down the encoding modes, which reduces the amount of processing required for encoding mode selection.
Therefore, according to the present invention, it is possible to provide a moving picture encoding technique capable of providing high-quality images with a small amount of computation and a small code amount.

FIG. 1 is a block diagram showing Embodiment 1 of the image encoding apparatus according to the present invention.
FIG. 2 is a block diagram of the mode selection unit in Embodiment 1 of the present invention.
FIG. 3 is a block diagram showing Embodiment 2 of the image encoding apparatus according to the present invention.
FIG. 4 is a block diagram of the mode selection unit in Embodiment 2 of the present invention.
FIG. 5 is an explanatory diagram of the moving picture encoding process.
FIG. 6 is an explanatory diagram of the inter-screen prediction process.
FIG. 7 is an explanatory diagram of the intra-screen prediction process.
FIG. 8 is an explanatory diagram of the category classification for intra-screen prediction and inter-screen prediction in the present invention.
FIG. 9 is an explanatory diagram of the performance of intra-screen prediction and inter-screen prediction.
FIG. 10 is a characteristic diagram showing an example of actual data representing the properties of intra-screen prediction and inter-screen prediction.
FIG. 11 is a characteristic diagram showing another example of actual data representing the properties of intra-screen prediction and inter-screen prediction.
FIG. 12 is an explanatory diagram of the intra/inter determination in the present invention.
FIG. 13 is an explanatory diagram of a problem that arises when parallel processing is performed.
FIG. 14 is an explanatory diagram of the pixel positions of the differential filter.
FIG. 15 is an explanatory diagram showing an example of the differential filter.
FIG. 16 is an explanatory diagram showing another example of the differential filter.
FIG. 17 is a flowchart of the mode selection process according to Embodiment 1 of the present invention.
FIG. 18 is an explanatory diagram of the mode selection process by a general method and by the present embodiment.
FIG. 19 is a flowchart of the mode selection process according to Embodiment 2 of the present invention.

Hereinafter, a moving picture encoding apparatus and a moving picture encoding method according to the present invention will be described with reference to the illustrated embodiments.
In the present invention, as in moving picture encoding schemes such as the above-mentioned H.264/AVC, the encoding target image is predicted using image information for which encoding has been completed, and the difference from the original image (referred to as the prediction difference) is encoded, in order to suppress the redundancy of the moving picture and reduce the code amount.

An outline of the encoding process in the present invention will now be described.
First, a case is described in which prediction is performed in units of blocks obtained by finely dividing the image, in order to exploit the local properties of the moving picture. That is, as shown in FIG. 5, prediction is performed on the target image 501 in units of macroblocks 502 of 16×16 pixels, following the raster scan order.
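The raster scan order over macroblocks can be sketched as follows. This is only an illustration: the function name `macroblocks_in_raster_order` is hypothetical, and the sketch assumes the picture width and height are multiples of the macroblock size.

```python
MB_SIZE = 16  # macroblock edge length in pixels, as stated in the text


def macroblocks_in_raster_order(width, height, mb_size=MB_SIZE):
    """Yield the (x, y) top-left corner of each macroblock,
    left to right within a row, rows from top to bottom."""
    for y in range(0, height, mb_size):
        for x in range(0, width, mb_size):
            yield (x, y)


# A 48x32 picture contains 3 x 2 macroblocks.
order = list(macroblocks_in_raster_order(48, 32))
print(order)
# [(0, 0), (16, 0), (32, 0), (0, 16), (16, 16), (32, 16)]
```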

As shown in the figure, the prediction means is broadly divided into two types: inter prediction (inter-screen prediction) 503 and intra prediction (intra-screen prediction) 504.
As also shown in the figure, the macroblock is divided into still smaller blocks for prediction.
At the time of encoding, which division pattern to use is determined for each macroblock in addition to the prediction means. The combination of prediction means and division pattern is called the "encoding mode", and its identifier is encoded as header information.

As indicated by 505, inter prediction is further divided into inter prediction P (Predictive), which predicts from one reference image, and inter prediction B (Bi-predictive), which can predict from two reference images. Inter prediction also includes a skip mode, in which neither the motion vector nor the prediction difference is encoded, and a direct mode.
The intra/inter determination can be made whichever of these modes is used, so they are not distinguished here and are collectively referred to as inter prediction.

Next, FIG. 6 conceptually shows the operation of inter-screen prediction in H.264/AVC. As shown in the figure, when inter-screen prediction is performed, the decoded image of an already encoded image contained in the same video 601 as the encoding target image 603 is used as the reference image 602, and a block (predicted image) 605 having a high correlation with the target block 604 in the target image is searched for in the reference image (this is called motion search).
In addition to the prediction difference calculated as the difference between the two blocks, the motion vector 606, expressed as the difference between the coordinate values of the two blocks, is also encoded as header information needed for prediction.
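The motion search described above can be sketched as an exhaustive block-matching search that minimizes SAD. This is a simplified illustration, not the search used by any particular encoder: the function names, the full-search strategy, the integer-pixel accuracy, and the sign convention of the motion vector (reference position minus target position) are all assumptions of this sketch.

```python
def sad_at(target, reference, bx, by, dx, dy, size):
    """SAD between the target block at (bx, by) and the reference block
    displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(target[by + y][bx + x]
                         - reference[by + dy + y][bx + dx + x])
    return total


def motion_search(target, reference, bx, by, size, search_range):
    """Return (InterSAD, (dx, dy)) for the best match within +/- search_range."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad_at(target, reference, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Synthetic 8x8 reference picture; the 2x2 target block at (2, 2) is a copy
# of the reference block at (3, 3), i.e. displaced by (1, 1).
reference = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
target = [[0] * 8 for _ in range(8)]
for y in range(2, 4):
    for x in range(2, 4):
        target[y][x] = reference[y + 1][x + 1]

print(motion_search(target, reference, 2, 2, 2, 2))  # (0, (1, 1))
```

The first element of the result corresponds to the InterSAD value on which the mode decision in this document is based.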

Next, FIG. 7(a) conceptually shows intra-screen prediction (intra prediction) in H.264/AVC. In intra-screen prediction, the decoded images of the already encoded blocks B, C, D, and E adjacent to the left, upper left, top, and upper right of the encoding target block A are used for prediction.
That is, prediction uses the 13 decoded pixels 701 that are contained in blocks B to E and adjoin the encoding target block A. All pixels lying on the same straight line whose slope is given by the prediction direction vector 702 are predicted from the same reference pixel.

In H.264/AVC, the optimum prediction direction can be selected for each block from eight types, such as vertical, horizontal, and diagonal.
In addition to these direction-based prediction modes, DC prediction, which predicts all pixels in the encoding target block from the average value of the reference pixels, is provided as prediction mode 2, alongside the eight prediction directions.
Information indicating which of these nine prediction modes 703 has been selected is encoded as header information together with the prediction difference.
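To make the idea of directional and DC prediction concrete, the following sketch implements three of the nine modes for a 4×4 block (vertical, horizontal, and DC, numbered 0, 1, and 2 as in H.264/AVC) and picks the one with the smallest SAD. It is a partial illustration: the six diagonal modes and the upper-left and upper-right reference pixels are omitted, both reference edges are assumed to be available, and the function names are hypothetical.

```python
def predict_vertical(top, left):
    """Mode 0: every column repeats the reference pixel above it."""
    return [list(top) for _ in range(4)]


def predict_horizontal(top, left):
    """Mode 1: every row repeats the reference pixel to its left."""
    return [[left[y]] * 4 for y in range(4)]


def predict_dc(top, left):
    """Mode 2: all pixels take the rounded mean of the 8 reference pixels."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


MODES = {0: predict_vertical, 1: predict_horizontal, 2: predict_dc}


def block_sad(a, b):
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def best_intra_mode(block, top, left):
    """Return (SAD, mode number) of the best of the three sketched modes."""
    return min((block_sad(block, predictor(top, left)), mode)
               for mode, predictor in MODES.items())


# A block of vertical stripes is predicted perfectly by the vertical mode.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_intra_mode(block, top, left))  # (0, 0)
```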

Incidentally, pseudo intra prediction is as shown in FIG. 7(b). In this case, both the reference image and the target pixels are taken from the original image, which is the point of difference from the intra prediction of FIG. 7(a).

Next, FIG. 8 is a graph conceptually showing an important property of intra prediction and inter prediction. Taking as the baseline the SAD obtained when the intra/inter determination is made by an ideal encoding mode selection scheme that achieves high encoding efficiency, the graph shows how much the SAD increases when the opposite (incorrect) determination is made. The horizontal axis represents InterSAD 801, the SAD obtained with inter prediction, and the vertical axis represents, as Y, the average increase in SAD 802 when the determination is wrong.

The intra prediction characteristic 803 represents the SAD increase when inter prediction is mistakenly selected although intra prediction should have been selected, and the inter prediction characteristic 804 represents the SAD increase when intra prediction is mistakenly selected although inter prediction should have been selected.
From the graph of FIG. 8, it can be seen that macroblocks can be divided into three categories according to the value of InterSAD, which is obtained as a parameter of inter prediction.

First, in the category 1 region 805, where InterSAD 801 is small, the SAD increase 802 is smaller for the intra prediction characteristic 803. That is, mistakenly selecting inter prediction costs little, whereas mistakenly selecting intra prediction (inter prediction characteristic 804) costs much.
The category 1 region 805 is therefore a region in which inter prediction is accurate and advantageous.
Next, in the category 2 region 806, where InterSAD 801 has increased to some extent, the intra prediction characteristic 803 rises while the inter prediction characteristic 804 falls, and the two cross partway through the region, so neither prediction means can be said to be generally superior.

In the category 3 region 807, where InterSAD 801 has become considerably large, the SAD increase 802 is smaller for the inter prediction characteristic 804. That is, mistakenly selecting intra prediction costs little.
The category 3 region 807 is therefore a region in which inter prediction is inaccurate and intra prediction is advantageous.
Accordingly, on either side of the point X at which the intra prediction characteristic 803 and the inter prediction characteristic 804 cross, points at which a significant difference appears between the two characteristics are chosen arbitrarily and set as threshold T1 and threshold T2, respectively.

The particularly important point here is that thresholds T1 and T2 depend very little on the picture content of the target moving picture or on the QP (Quantization Parameter), and can be treated as constants under any circumstances without any problem.
The reason thresholds T1 and T2 can be treated as constants is explained conceptually below with reference to FIG. 9.

First, as shown in FIG. 9(a), the macroblocks classified into category 1 are generally those for which inter prediction works well, such as when an object is stationary or moving in translation (901).
In this case, the range of InterSAD values varies considerably with the QP value (902), but InterSAD is generally small in every band.

Next, as shown in FIG. 9(b), category 2 (903) contains macroblocks for which the performance of inter prediction is degraded by complex changes in picture content between frames, such as three-dimensional motion or changes in illumination.
Here too the range of InterSAD varies with QP (904), but InterSAD generally takes fairly large values.

Finally, as shown in FIG. 9(c), category 3 (905) contains macroblocks for which inter prediction hardly works at all because the target object is not present in the reference image, for example because of occlusion or a scene change.
In this case, InterSAD still varies somewhat (906), but its value no longer necessarily depends on QP.

From this categorization it can be seen that, because of the differences in prediction accuracy, the difference in SAD between categories is generally larger than the difference in SAD within a category: the maximum InterSAD of category 1 is smaller than the minimum InterSAD of category 2, and the maximum InterSAD of category 2 is smaller than the minimum InterSAD of category 3 (907).

In general, InterSAD then satisfies the following, where SAD1, SAD2, and SAD3 denote the InterSAD of categories 1, 2, and 3:
SAD1 (QP: large) < SAD2 (QP: small)
and SAD2 (QP: large) < SAD3 (QP: minimum value)
Accordingly, in each of the cases of FIGS. 10(a), (b), and (c), the categories can be distinguished from one another regardless of the QP value. The category boundaries can therefore be regarded as fixed under any circumstances, and thresholds T1 and T2 can be treated as constants.

FIG. 10 shows data obtained when an actual video (Seq1: 1080i, 4:2:2, 10 bit) was encoded. FIG. 10(a) is for QP = 12, FIG. 10(b) for QP = 27, and FIG. 10(c) for QP = 42, all for Seq1. It can be seen that the distribution of the graph hardly changes even when QP is changed.
Next, FIG. 11 shows data obtained when a completely different video (Seq2: 1080i, 4:2:2, 10 bit) was encoded. FIG. 11(a) is for QP = 12, FIG. 11(b) for QP = 27, and FIG. 11(c) for QP = 42, all for Seq2.

In FIG. 11, the distribution of the graph differs from that of FIG. 10, but the properties of each category are the same as in FIG. 10. The thresholds T1 and T2 for separating the categories can therefore take the same values as in the case of FIG. 10.
However, when the dependence of the intra/inter determination result on QP is not a concern, such as when the rate control function is not used, thresholds T1 and T2 need not be fixed. For example, changing the thresholds according to QP has been confirmed to yield a slight improvement in encoding efficiency.

As a result, when macroblocks are divided into three categories according to the SAD of inter prediction, dividing them with the constant thresholds T1 and T2 causes no inconvenience, and the prediction means that is advantageous for each category can be selected reliably.
Next, a method of making the intra/inter determination with thresholds T1 and T2 is described with reference to FIG. 12.

First, when the target macroblock belongs to category 1, inter prediction is advantageous, as shown in the figure, so inter prediction is selected as "Case 1".
Next, when it belongs to category 2, it cannot be said in general which prediction means is advantageous, so a more detailed intra/inter determination is made as "Case 2".
When it belongs to category 3, intra prediction is advantageous, so intra prediction is selected as "Case 3".
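The three cases can be sketched as a simple threshold test on InterSAD. The numeric values of `T1` and `T2` below are hypothetical placeholders (the text gives no numbers), the sketch assumes T1 < T2, and which side of each threshold the boundary value falls on is likewise an assumption. The more detailed determination for Case 2 is passed in as a callable because its details are not covered by this passage.

```python
# Hypothetical constant thresholds; the specification gives no values.
T1 = 2000
T2 = 6000


def select_prediction(inter_sad, detailed_decision, t1=T1, t2=T2):
    """Return "inter" or "intra" for one macroblock from its InterSAD.

    detailed_decision: zero-argument callable used only in Case 2.
    """
    if inter_sad < t1:
        return "inter"            # Case 1: category 1, inter prediction wins
    if inter_sad >= t2:
        return "intra"            # Case 3: category 3, intra prediction wins
    return detailed_decision()    # Case 2: category 2, needs a closer look


print(select_prediction(500, lambda: "intra"))    # inter
print(select_prediction(9000, lambda: "inter"))   # intra
print(select_prediction(3000, lambda: "intra"))   # intra (from Case 2)
```

Only Case 2 requires any further evaluation, and the test itself does not involve the quantization parameter, so it can run before rate control has fixed QP.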

To perform intra prediction, encoding of the neighboring macroblocks must have been completed and decoded images usable as reference pixels must have been obtained.
The positional relationship of the neighboring macroblocks referred to in intra prediction is first described with reference to FIG. 13.
In the target image of FIG. 13, when intra prediction is performed on, for example, the target block (1301), the four macroblocks (1302) located to the left, upper left, top, and upper right of the target block are referred to.

Intra prediction therefore cannot be performed unless encoding of these four macroblocks (1302) is complete.
Encoding proceeds in raster-scan order, so when encoding is performed sequentially, for example, encoding of these neighboring macroblocks is always complete by the time the target block is encoded, and intra prediction can be executed using their decoded images.

When both intra prediction and inter prediction are available at the time of encoding-mode selection, the prediction process is generally performed once in every candidate encoding mode, the cost value Cost is calculated by Equations 1 and 2 below, and the encoding mode giving the smallest cost value is selected.

(Equation 1)

(Equation 2)

Here, Dist is the prediction error, Rate is the code amount of the header associated with the prediction, B is the original image of the target block, and B' is the predicted image of the target block.
The coefficient weight adjusts the relative contributions of the prediction error and the code amount to the cost value, and is determined statistically according to the type of encoding mode and the value of the quantization parameter.
SAD, in turn, is defined by Equation 3 below.

(Equation 3)

Here, p[i,j] denotes the pixel value at coordinates (i,j) in the target block B, and q[i,j] denotes the pixel value at coordinates (i,j) in the predicted image B' of the target block.
For these pixel values, the luminance component alone may be used, or the luminance and color-difference components may be combined.
Using the prediction error function given by Equation 4 below in place of that given by Equation 2 is said to be even more effective.

(Equation 4)

Here, SATD stands for the sum of absolute Hadamard-transformed differences (Sum of Hadamard Absolute Transformed Differences). It is obtained by applying the Hadamard transform, one type of frequency transform, to the difference between the original image and the predicted image of the target block and then summing the absolute values of the resulting coefficients, and is defined by Equation 5 below.

(Equation 5)

Here, Tr denotes the function that applies the Hadamard transform to a block, and Tr(B)[a,b] denotes the transform coefficient component (a,b) obtained by applying the Hadamard transform to the target block B.
Besides the above, any measure that reflects the similarity between the original image and the predicted image of the target block, such as the sum of squared differences (SSD), may be used in the cost function.
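The formula images for Equations 1 to 5 are not reproduced in this text. Based only on the symbol definitions given above, they would plausibly read as follows. This is a reconstruction under that assumption, not the published formulas; in particular, the summation ranges (all pixels or all coefficients of the block) and the exact form of Equation 5 are assumed.

```latex
\begin{align}
\mathrm{Cost} &= \mathrm{Dist} + \mathrm{weight} \cdot \mathrm{Rate} \tag{1}\\
\mathrm{Dist} &= \mathrm{SAD}(B, B') \tag{2}\\
\mathrm{SAD}(B, B') &= \sum_{i,j} \bigl|\, p[i,j] - q[i,j] \,\bigr| \tag{3}\\
\mathrm{Dist} &= \mathrm{SATD}(B, B') \tag{4}\\
\mathrm{SATD}(B, B') &= \sum_{a,b} \bigl|\, Tr(B)[a,b] - Tr(B')[a,b] \,\bigr| \tag{5}
\end{align}
```

Because the Hadamard transform is linear, the summand in (5) equals the absolute value of the (a,b) coefficient of Tr applied to the difference block, which matches the verbal definition of SATD given in the text.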

In an encoder with, for example, a pipeline structure that processes macroblocks in parallel, however, encoding of these neighboring macroblocks is often not yet complete when the mode of the target macroblock is selected.
In that case, intra prediction cannot be performed using the decoded images of the neighboring blocks, and a prediction error such as the SAD cannot be calculated.

A commonly used approach in this case is to perform pseudo intra prediction at mode-selection time using the original images of the neighboring blocks in place of their decoded images, determine the mode from the resulting SAD (pseudo SAD), and then, once encoding of the neighboring macroblocks is complete, regenerate the predicted image in the determined mode and encode it.

In this approach, however, quantization error causes the original image and the decoded image of the reference pixels to differ. At low bit rates in particular, the SAD and the pseudo SAD diverge widely, the wrong encoding mode is selected, and image quality degrades significantly.
Inter prediction, by contrast, refers to a different image that has already been encoded, so it can be executed for any block in the target image even when macroblocks are processed in parallel, and an accurate SAD can always be calculated.
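The divergence between the SAD and the pseudo SAD can be illustrated with a toy example. The sketch below is only an assumption-laden illustration: it uses a simple DC intra prediction from the row above and the column to the left, and it imitates coding loss by rounding the neighboring pixels to a multiple of a step size, which is a crude stand-in for real transform-domain quantization. All names and pixel values are hypothetical.

```python
def sad(block, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(p - q)
               for row_b, row_p in zip(block, pred)
               for p, q in zip(row_b, row_p))


def dc_predict(top, left, size):
    """DC intra prediction: fill the block with the rounded mean of the neighbours."""
    neighbours = list(top) + list(left)
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * size for _ in range(size)]


def quantize(pixels, step):
    """Crude stand-in for coding loss: snap each pixel to a multiple of `step`."""
    return [(p + step // 2) // step * step for p in pixels]


BLOCK = [[100, 102, 104, 106]] * 4   # original pixels of the target block
TOP_ORIG = [98, 101, 103, 107]       # original pixels of the row above
LEFT_ORIG = [99, 100, 101, 102]      # original pixels of the column to the left


def sad_gap(step):
    """Return (pseudo SAD, true SAD) for a given quantization step."""
    pseudo = sad(BLOCK, dc_predict(TOP_ORIG, LEFT_ORIG, 4))
    true = sad(BLOCK, dc_predict(quantize(TOP_ORIG, step),
                                 quantize(LEFT_ORIG, step), 4))
    return pseudo, true


for step in (2, 16, 32):             # a larger step imitates a lower bit rate
    print(step, *sad_gap(step))
```

The pseudo SAD does not depend on the step size, whereas the SAD computed from the degraded neighbors drifts away from it as the step grows, which is the effect the text attributes to low bit rates.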

Under these circumstances, using the present invention, which, as described with reference to FIG. 12, makes the determination from the result of inter prediction, for which an accurate SAD can always be calculated, enables a highly accurate intra/inter determination even when accurate intra prediction cannot be performed.
The further determination method used when the target macroblock is classified into category 2 is not particularly limited, but the following method, for example, is effective.

First, let InterSAD be the SAD of a representative inter prediction mode (for example, the mode with the smallest block size or the mode with the smallest SAD), and let Intra pseudSAD be the pseudo SAD of a representative intra prediction mode (for example, the mode with the smallest block size or the mode with the smallest SAD). If the absolute value of their difference is less than a preset threshold T3 of a desired width, a clear intra/inter determination is considered difficult because of the influence of quantization error, and inter prediction, which is generally more effective than intra prediction, is selected.

Otherwise, the relative performance of intra prediction and inter prediction is considered sufficiently distinguishable, so the cost is calculated by, for example, Equation 1 above with InterSAD and Intra pseudSAD as the prediction errors, and the prediction means with the smaller cost value is selected.
Varying the threshold T3 according to the value of the quantization parameter is particularly effective, but a fixed value, used for example to satisfy requirements such as rate control, is also sufficiently effective.

The prediction error of intra prediction need not be the pseudo SAD: the cost may instead be calculated from the SATD (pseudo SATD) or SSD (pseudo SSD) computed from the pseudo intra-predicted image, or the determination may be made by comparing the prediction errors directly without any cost calculation.
Nor does the intra-prediction cost value have to be based on such a pseudo intra-prediction error: for example, the variance calculated from the original image of the target block, the result of applying a differential filter to each pixel (original image) of the target block, or a combination of these may be used.
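The alternative measures named above can be sketched as follows. This is an illustrative sketch only: the 4x4 block size, the unnormalized Hadamard matrix, and the function names are assumptions, and practical encoders often scale the SATD (for example by halving it).

```python
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]   # symmetric 4x4 Hadamard matrix (unnormalised)


def _matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4(orig, pred):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    diff = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(orig, pred)]
    coeffs = _matmul4(_matmul4(H4, diff), H4)   # H * D * H^T, with H symmetric
    return sum(abs(c) for row in coeffs for c in row)


def ssd(orig, pred):
    """Sum of squared differences of two equally sized blocks."""
    return sum((o - p) ** 2
               for ro, rp in zip(orig, pred)
               for o, p in zip(ro, rp))


def variance(block):
    """Population variance of the pixel values of one block (original image)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)
```

Passing a pseudo intra-predicted image as `pred` gives the pseudo SATD or pseudo SSD; `variance` needs only the original block and so involves no prediction at all.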

When a differential filter is used, it is effective to also use the pixels (original image) surrounding the target block, as shown for example in FIG. 14.
Effective choices of differential filter include the Sobel filter shown in FIG. 15 and the Prewitt filter shown in FIG. 16. With these filters, the edge strength within the target block can be calculated while, for example, varying the angle, and the cost value can be calculated from the largest value.
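A minimal sketch of the edge-strength idea follows. It assumes the standard 3x3 Sobel kernels (FIG. 15 is not reproduced here, so the coefficients are the textbook ones), evaluates only two angles, and uses the largest summed response directly as a stand-in cost. How the patent maps edge strength to a cost value is not specified, so that mapping and all names are hypothetical.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def edge_strength(padded, kernel):
    """Sum of absolute filter responses over the inner block of `padded`.

    `padded` is the target block plus a one-pixel border of surrounding
    original pixels, so every block pixel has a full 3x3 neighbourhood.
    """
    height, width = len(padded), len(padded[0])
    total = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(kernel[dy][dx] * padded[y - 1 + dy][x - 1 + dx]
                           for dy in range(3) for dx in range(3))
            total += abs(response)
    return total


def intra_cost_from_edges(padded):
    """Use the strongest directional edge response as a stand-in intra cost."""
    return max(edge_strength(padded, k) for k in (SOBEL_X, SOBEL_Y))
```

The one-pixel border corresponds to the use of surrounding original pixels mentioned in the text; without it, the filter could not be evaluated at the block boundary.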

Next, an embodiment of the present invention that performs encoding using the intra/inter determination method described above is described as Embodiment 1.
FIG. 1 shows a moving image encoding apparatus 100 according to Embodiment 1, which receives an original image 101 and generates an encoded stream of it.

To this end, the moving image encoding apparatus 100 includes an input image memory 102 that holds the input original image 101, a block dividing unit 103 that divides the input image into small regions, a pseudo intra-screen prediction unit 105 that performs pseudo intra-screen prediction in units of blocks, an inter-screen prediction unit 106 that performs inter-screen prediction in units of blocks based on the amount of motion detected by a motion search unit 104, a mode selection unit 107 that determines the prediction mode (prediction means and block size) suited to the characteristics of the image, and an intra-screen prediction unit 108 that performs accurate intra-screen prediction according to the result of the mode selection unit 107.

It further includes a subtraction unit 109 that generates the prediction difference, a frequency transform unit 110 and a quantization processing unit 111 that encode the prediction difference, a variable-length coding unit 112 that performs adaptive coding according to the occurrence probability of symbols, an inverse quantization processing unit 113 and an inverse frequency transform unit 114 that decode the prediction difference once it has been encoded, an addition unit 115 that generates a decoded image from the decoded prediction difference, and a reference image memory 116 that holds the decoded image for use in later prediction.

Next, the operation of the moving image encoding apparatus 100 is described.
Assume that the original image 101 is input to the moving image encoding apparatus 100.
The input image memory 102 first holds one image from the original image 101 as the encoding target image. The block dividing unit 103 divides this image into small blocks and passes them to the motion search unit 104, the pseudo intra-screen prediction unit 105, the inter-screen prediction unit 106, the intra-screen prediction unit 108, and the subtraction unit 109.

The motion search unit 104 calculates the amount of motion of the block using a decoded image stored in the reference image memory 116 and passes the motion vector to the inter-screen prediction unit 106.
The pseudo intra-screen prediction unit 105 and the inter-screen prediction unit 106 then execute, for a plurality of block sizes, pseudo intra-screen prediction using the original images of the neighboring blocks and inter-screen prediction referring to a different, already encoded image, respectively, and the mode selection unit 107 selects the optimal prediction mode.

When the mode selection result is an intra mode, the intra-screen prediction unit 108 performs, once encoding of the neighboring macroblocks is complete, accurate intra-screen prediction in that mode using the decoded images of the neighboring blocks, and sends the predicted image to the subtraction unit 109 and the addition unit 115.
When the mode selection result is an inter mode, the predicted image already generated for that mode by the inter-screen prediction unit 106 is sent to the subtraction unit 109 and the addition unit 115.

The subtraction unit 109 takes the difference (prediction difference) between the original image of the target block and the predicted image generated in the mode selected by the mode selection unit 107, and passes it to the frequency transform unit 110. The frequency transform unit 110 and the quantization processing unit 111 then apply a frequency transform such as the DCT and quantization, respectively, to the prediction difference in units of blocks of a specified size, and pass the result to the variable-length coding unit 112 and the inverse quantization processing unit 113.

The variable-length coding unit 112 encodes the quantized frequency transform coefficients and the header information based on the occurrence probability of symbols and generates the encoded stream of the input original image 101, thereby fulfilling the primary function of the moving image encoding apparatus 100.

Meanwhile, the inverse quantization processing unit 113 and the inverse frequency transform unit 114 apply inverse quantization and an inverse frequency transform such as the inverse DCT, respectively, to the quantized frequency transform coefficients to recover the prediction difference, which is sent to the addition unit 115.
The addition unit 115 adds the predicted image and the decoded prediction difference to generate a decoded image, which is stored in the reference image memory 116.

Next, the mode selection unit 107 is described in detail with reference to FIG. 2.
As illustrated, the mode selection unit 107 comprises an intra mode determination unit 201 that calculates costs for the intra modes and selects the optimal encoding mode, an inter mode determination unit 202 that calculates costs for the inter modes and selects the optimal encoding mode, and an intra/inter determination unit 203 that determines the prediction means (intra prediction or inter prediction).

The intra mode determination unit 201 calculates the intra pseudo SAD from the original image sent from the block dividing unit 103 and the intra pseudo-predicted image calculated by the pseudo intra-screen prediction unit 105, and calculates the cost value.
The inter mode determination unit 202 calculates the inter SAD from the original image sent from the block dividing unit 103 and the inter-predicted image calculated by the inter-screen prediction unit 106, and calculates the cost value.
The intra/inter determination unit 203 then selects the prediction means for the target macroblock using the intra pseudo SAD and the inter SAD of the representative mode of each prediction means.

Next, the mode selection process is described with reference to the flowchart of FIG. 17.
When the mode selection process starts, inter-screen prediction is first executed in all inter modes and the cost is calculated for each mode (1701). The inter mode giving the smallest SAD is then taken as the inter representative mode, and its SAD is taken as InterSAD (1702).
Pseudo prediction is likewise performed in all intra modes and the cost is calculated (1703). The mode with the smallest pseudo SAD is taken as the intra representative mode, and its pseudo SAD is taken as IntraSAD (1704).

Next, the calculated InterSAD is compared with the threshold T1 to determine whether InterSAD is less than T1 (1705).
If the result is YES, that is,
InterSAD < T1
the macroblock falls under "Case 1" of FIG. 12, that is, category 1. Inter prediction is therefore selected as the prediction means, and the inter mode with the smallest cost value is selected as the optimal encoding mode (1706).

If the result is NO, the process proceeds to step (1707), where InterSAD is compared with the threshold T2 to determine whether InterSAD is greater than or equal to T2.
If the result is YES, that is,
InterSAD ≥ T2
the macroblock falls under "Case 3" of FIG. 12, that is, category 3. Intra prediction is therefore selected as the prediction means, and the intra mode with the smallest cost value is selected as the optimal encoding mode (1708).

If the result is NO, the macroblock falls under "Case 2" of FIG. 12, that is, category 2, and a more detailed intra/inter determination is performed. The process proceeds to the determination step (1709), which determines whether the absolute value of the difference between InterSAD and IntraSAD is less than the preset threshold T3.
As shown in FIG. 12, the threshold T3 is a desired range, set arbitrarily in advance, centered on the point X at which the intra prediction characteristic 803 and the inter prediction characteristic 804 plotted against InterSAD (801) intersect and become equal.

If the result is YES, that is,
|InterSAD - IntraSAD| < T3
the inter mode with the smallest cost value is selected (1710). This is because, as noted above, inter prediction is generally considered more effective than intra prediction.
If the result is NO, cost values using the intra pseudo SAD and the inter SAD are calculated for all encoding modes, and the mode with the smallest cost value is selected as the optimal encoding mode (1711).

When any one of steps (1706), (1708), (1710), and (1711) finishes, the mode selection process ends (1712), completing mode selection for one macroblock by the mode selection unit 107. The encoded stream of the input original image 101 is thus generated with the optimal prediction mode selected, and the moving image encoding apparatus 100 fulfills its function.
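The decision steps (1705) to (1711) of FIG. 17 can be summarized in a short function. This is a sketch under stated assumptions rather than the apparatus itself: the per-mode cost values are taken as precomputed inputs (steps 1701 to 1704), mode names are hypothetical labels, and the tie-break in favor of inter prediction in step (1711) is an assumption not stated in the text.

```python
def select_mode(inter_costs, intra_costs, inter_sad, intra_sad, t1, t2, t3):
    """Return (prediction_means, mode_name) following steps 1705 to 1711.

    inter_costs / intra_costs: dict mapping a mode name to its cost value
    inter_sad: SAD of the inter representative mode (InterSAD)
    intra_sad: pseudo SAD of the intra representative mode (IntraSAD)
    """
    best_inter = min(inter_costs, key=inter_costs.get)
    best_intra = min(intra_costs, key=intra_costs.get)

    if inter_sad < t1:                     # 1705 -> 1706 (Case 1)
        return "inter", best_inter
    if inter_sad >= t2:                    # 1707 -> 1708 (Case 3)
        return "intra", best_intra
    if abs(inter_sad - intra_sad) < t3:    # 1709 -> 1710 (Case 2, too close to call)
        return "inter", best_inter
    # 1711: compare cost values over all modes (a tie goes to inter: assumption)
    if inter_costs[best_inter] <= intra_costs[best_intra]:
        return "inter", best_inter
    return "intra", best_intra
```

For example, with thresholds T1 = 500, T2 = 2000, and T3 = 100, an InterSAD of 1000 and an IntraSAD of 1050 fall within the T3 band, so the best inter mode is returned regardless of the cost values.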

In Embodiment 1 described above, prediction in all modes is executed in steps (1701) to (1703) before the intra/inter determination (1705), as shown in FIG. 17.
However, particularly when the target is, for example, a software encoder that processes macroblocks in parallel, only the inter representative mode need be predicted before the intra/inter determination, with predicted images for the other modes generated afterwards according to the results of determinations such as (1705) to (1707). Unnecessary prediction processing can then be omitted, further reducing the amount of calculation.

In this case, prediction in the inter modes other than the representative mode is executed in steps (1706) and (1710) to determine the inter prediction mode, while pseudo prediction in all intra modes is executed in step (1708) to determine the intra prediction mode.
For the determination using the intra representative mode (1709), pseudo prediction need be performed only in the intra representative mode.

Next, Embodiment 2 of the present invention is described.
Embodiment 1 above is intended for situations in which intra prediction cannot be performed when the mode of the target macroblock is selected, because encoding of the neighboring macroblocks is incomplete owing to parallel processing or the like. Embodiment 2, described below, is instead intended for cases in which encoding of the neighboring macroblocks is already complete at mode-selection time, such as when encoding is performed sequentially, so that intra prediction can always be performed correctly.
Embodiment 2 is therefore particularly effective when the encoder is implemented in software.

FIGS. 18(a) and 18(b) conceptually show the flow of mode selection in a general encoder, for example one using a general technique such as the conventional art described above, and in an encoder according to the present invention, respectively.
In the general encoder, as shown in FIG. 18(a), prediction is first executed for each macroblock in all encoding modes, the cost is calculated by, for example, Equation 1 above, and the mode with the smallest cost is selected.
As noted above, however, this requires a very large amount of calculation for the prediction processing at mode-selection time.

In the present invention, by contrast, the intra/inter determination already described with reference to FIG. 12 is performed first, which narrows down the modes, reduces the number of predicted images that must be generated, and so reduces the processing load.
Specifically, in this example of the present invention, as shown in FIG. 18(b), prediction is first performed in the inter representative mode (for example, the inter mode with a block size of 8×8) (1802), and the candidates are narrowed down according to the resulting SAD value.
When the SAD value is smaller than the threshold T1, a mode is selected from among the inter modes (Case 1).

In this case, the predicted-image generation process using the intra modes can therefore be omitted.
When the SAD value is greater than or equal to the threshold T2, a mode is selected from among the intra modes (Case 3).
In this case, the predicted-image generation process using the inter modes other than the representative mode can be omitted.
When the SAD value is neither of the above, prediction is performed in all encoding modes, as in the general technique, for example, and the mode with the smallest cost is selected (Case 2).
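The saving described for FIG. 18(b) comes from generating predicted images only for the modes that remain candidates. The sketch below illustrates this with caller-supplied predictor callbacks; the callback interface, the mode names, and the choice to drop the inter representative mode from the candidates in Case 3 are all assumptions made for illustration.

```python
def select_mode_lazy(predict_inter, predict_intra, inter_modes, intra_modes,
                     representative, t1, t2):
    """Select a mode while predicting only the modes that remain candidates.

    predict_inter(mode) and predict_intra(mode) each return (sad, cost) for one
    mode; every call stands for one predicted-image generation.
    """
    rep_sad, rep_cost = predict_inter(representative)
    costs = {("inter", representative): rep_cost}
    other_inter = [("inter", m) for m in inter_modes if m != representative]
    all_intra = [("intra", m) for m in intra_modes]

    if rep_sad < t1:            # Case 1: intra prediction is skipped entirely
        todo = other_inter
    elif rep_sad >= t2:         # Case 3: remaining inter modes are skipped
        costs = {}
        todo = all_intra
    else:                       # Case 2: evaluate every mode, as in FIG. 18(a)
        todo = other_inter + all_intra

    for kind, mode in todo:
        predictor = predict_inter if kind == "inter" else predict_intra
        costs[(kind, mode)] = predictor(mode)[1]
    return min(costs, key=costs.get)
```

Counting the calls to the two callbacks gives a direct measure of how many predicted images were generated in each case.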

FIG. 3 shows a moving image encoding apparatus 300 according to Embodiment 2 of the present invention. When an original image 301 is input, an encoded stream 311 of the input original image 301 is generated.
To this end, the moving image encoding apparatus 300 includes an input image memory 302 that holds the input original image 301, a block dividing unit 303 that divides the input image into small regions, an intra-screen prediction unit 305 that performs intra-screen prediction in units of blocks, and an inter-screen prediction unit 306 that performs inter-screen prediction in units of blocks based on the amount of motion detected by a motion search unit 304.

It further includes a mode selection unit 307 that determines the prediction mode (prediction means and block size) suited to the characteristics of the image, a subtraction unit 308 that generates the prediction difference, a frequency transform unit 309 and a quantization processing unit 310 that encode the prediction difference, a variable-length coding unit 311 that performs adaptive coding according to the occurrence probability of symbols, an inverse quantization processing unit 312 and an inverse frequency transform unit 313 that decode the prediction difference once it has been encoded, an addition unit 314 that generates a decoded image from the decoded prediction difference, and a reference image memory 315 that holds the decoded image for use in later prediction.

Next, the operation of the moving image encoding apparatus 300 is described.
Assume that the original image 301 is input to the moving image encoding apparatus 300.
The input image memory 302 first holds one image from the original image 301 as the encoding target image. The block dividing unit 303 divides this image into small blocks and passes them to the motion search unit 304, the intra-screen prediction unit 305, and the inter-screen prediction unit 306.
The motion search unit 304 calculates the amount of motion of the block using a decoded image stored in the reference image memory 315 and passes the motion vector to the inter-screen prediction unit 306.

At this time, the intra-screen prediction unit 305 performs intra-screen prediction with reference to already encoded neighboring blocks, and the inter-screen prediction unit 306 performs inter-screen prediction with reference to another already encoded picture, each for a plurality of block sizes.
The mode selection unit 307 then selects the optimal prediction mode and sends the predicted image created in the selected mode to the subtraction unit 308 and the addition unit 314.
The subtraction unit 308 generates the difference (prediction difference) between the original image of the target block and the predicted image created in the mode selected by the mode selection unit 307, and passes it to the frequency transform unit 309.

The frequency transform unit 309 and the quantization processing unit 310 apply a frequency transform such as the DCT and quantization, respectively, to the received prediction difference in units of blocks of a designated size, and pass the result to the variable-length coding unit 311 and the inverse quantization processing unit 312.
The variable-length coding unit 311 encodes the quantized frequency transform coefficients and the header information on the basis of symbol occurrence probabilities, thereby generating the encoded stream 311.

Meanwhile, the inverse quantization processing unit 312 and the inverse frequency transform unit 313 apply inverse quantization and an inverse frequency transform such as the inverse DCT, respectively, to the quantized frequency transform coefficients to recover the prediction difference, which is sent to the addition unit 314.
The addition unit 314 adds the predicted image and the decoded prediction difference to generate a decoded image, and stores it in the reference image memory 315.
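
The path from the subtraction unit 308 through to the addition unit 314 can be sketched for a single square block as follows. This is an illustration under stated assumptions, not the apparatus's actual arithmetic: it uses a floating-point orthonormal DCT-II and a single uniform quantization step `qstep`, whereas real codecs use integer transforms and more elaborate quantizers. The names `dct_matrix` and `encode_block` are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0, :] = np.sqrt(1.0 / n)       # DC row has a different scale factor
    return c


def encode_block(orig, pred, qstep):
    """Return (levels, recon) for one square block of 8-bit samples.

    levels: quantized coefficients that would go to entropy coding (311)
    recon:  decoded block that would be stored in reference memory (315)
    """
    c = dct_matrix(orig.shape[0])
    # Subtraction unit 308: prediction difference.
    residual = orig.astype(np.float64) - pred.astype(np.float64)
    # Frequency transform unit 309: separable 2-D DCT.
    coeffs = c @ residual @ c.T
    # Quantization processing unit 310 (assumed uniform quantizer).
    levels = np.round(coeffs / qstep).astype(np.int64)
    # Inverse quantization 312 and inverse frequency transform 313.
    decoded_residual = c.T @ (levels * float(qstep)) @ c
    # Addition unit 314: prediction plus decoded difference, clipped to 8 bits.
    recon = np.clip(np.round(pred + decoded_residual), 0, 255).astype(np.uint8)
    return levels, recon
```

The encoder reconstructs from the quantized levels rather than from the original residual so that its reference pictures match what a decoder will obtain.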

Next, the mode selection unit 307 of the moving picture coding apparatus 300 will be described in detail with reference to FIG. 4.
The mode selection unit 307 comprises an intra/inter determination unit 401 that determines the prediction means, an intra mode determination unit 402 that calculates costs for the intra modes and selects the optimal coding mode, and an inter mode determination unit 403 that calculates costs for the inter modes and selects the optimal coding mode.

First, the intra/inter determination unit 401 calculates the SAD of the representative mode from the predicted image of the inter representative mode created by the inter-screen prediction unit 306 and the original image sent from the block dividing unit 303. Then, depending on this SAD value, the intra-screen prediction unit 305 and the inter-screen prediction unit 306 generate the predicted images of the coding modes needed for the next determination.
These predicted images are sent to the intra mode determination unit 402 and the inter mode determination unit 403, and the final mode is determined.

Next, the procedure of the mode selection processing performed by the mode selection unit 307 will be described with reference to FIG. 19.
When the mode selection processing starts, inter-screen prediction is first executed in the representative inter mode (1901), and the SAD in this mode is calculated and taken as InterSAD (1902).
Next, it is checked whether InterSAD is smaller than a threshold T1 (1903).
If the result is YES, inter prediction is selected as the prediction means: inter-screen prediction is executed in the inter modes other than the representative mode (1904), and the mode with the lowest cost among them is selected (1905).

If the result of the determination (1903) is NO, that is, if InterSAD is greater than or equal to the threshold T1, it is then checked whether InterSAD is greater than or equal to a threshold T2 (1906).
If the result is YES, intra prediction is selected as the prediction means: intra-screen prediction is executed in all intra modes (1907), and the mode with the lowest cost among them is selected (1908).
If the result of the determination (1906) is NO, that is, if InterSAD is greater than or equal to T1 and less than T2, prediction is executed in the inter modes other than the representative mode (1909), then in all intra modes (1910), and finally the mode with the smallest cost value among all modes is selected (1911).
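
The three-way branch of steps 1903 to 1911 can be sketched as follows. The thresholds T1 and T2 and the step numbers come from the text; everything else is hypothetical: the names `classify` and `select_mode`, the mode labels, and the use of callables so that the remaining inter or intra modes are only evaluated when the branch requires them. The sketch also assumes that the representative mode remains a candidate whenever inter modes are compared, which the text does not state explicitly.

```python
from enum import Enum


class Candidates(Enum):
    INTER_ONLY = "inter"
    INTRA_ONLY = "intra"
    BOTH = "both"


def classify(inter_sad, t1, t2):
    """Decide which prediction means to evaluate from InterSAD."""
    if t1 > t2:
        raise ValueError("expected T1 <= T2")
    if inter_sad < t1:       # step 1903: YES
        return Candidates.INTER_ONLY
    if inter_sad >= t2:      # step 1906: YES
        return Candidates.INTRA_ONLY
    return Candidates.BOTH   # T1 <= InterSAD < T2


def select_mode(inter_sad, t1, t2, rep_mode, rep_cost,
                eval_other_inter, eval_intra):
    """Return the name of the lowest-cost mode.

    eval_other_inter / eval_intra are hypothetical callables that run the
    predictions and return a {mode_name: cost} dict; they are invoked only
    in the branches that need them, which is where computation is saved.
    """
    kind = classify(inter_sad, t1, t2)
    if kind is Candidates.INTER_ONLY:
        costs = {rep_mode: rep_cost}       # assumption: representative kept
        costs.update(eval_other_inter())   # steps 1904-1905
    elif kind is Candidates.INTRA_ONLY:
        costs = dict(eval_intra())         # steps 1907-1908
    else:
        costs = {rep_mode: rep_cost}       # assumption: representative kept
        costs.update(eval_other_inter())   # step 1909
        costs.update(eval_intra())         # step 1910
    return min(costs, key=costs.get)       # steps 1905 / 1908 / 1911
```

Only macroblocks whose InterSAD falls between the two thresholds pay for both the intra and the inter evaluation.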

When any one of processing (1905), (1908), or (1911) finishes, the mode selection processing ends (1912), completing mode selection for one macroblock by the mode selection unit 307. The encoded stream of the input original image 301 is thus generated with the optimal prediction mode selected, allowing the moving picture coding apparatus 300 to deliver its intended performance.

Because the present invention concerns only the intra/inter determination unit, the method of mode determination within the intra prediction modes and within the inter prediction modes is not particularly limited.
Accordingly, costs may be calculated by executing prediction in all modes, as in the embodiment above, or the determination may be made by another method, for example one that takes edge directionality into account.
Narrowing down the modes by some method at this stage to save computation is even more effective.

In the embodiment above, the mode with the smallest SAD is set as the representative mode. However, the representative mode can be chosen in various ways, for example the mode with the smallest block size, conversely the mode with the largest block size, or a combination of multiple modes, and the selection method is not limited.
Likewise, in the embodiment above, the SAD of the inter representative mode is used to classify each macroblock into one of three patterns. However, any feature quantity obtainable by performing prediction in the inter representative mode, such as the SATD, the SSD, the motion vector, or the header code amount, may be used for the classification, and the number of patterns is not limited.
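
Three of the feature quantities named above can be computed as in the following sketch. The function names are hypothetical, and the SATD shown is an unnormalized 4x4 Hadamard version; encoders differ in block size and scaling, so the exact definition used by the apparatus is an assumption.

```python
import numpy as np

# 4x4 Hadamard matrix (entries +1/-1) used for the SATD below.
H4 = np.array([
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
], dtype=np.int64)


def _diff(orig, pred):
    """Signed prediction difference, widened to avoid uint8 wrap-around."""
    return orig.astype(np.int64) - pred.astype(np.int64)


def sad(orig, pred):
    """Sum of absolute differences."""
    return int(np.abs(_diff(orig, pred)).sum())


def ssd(orig, pred):
    """Sum of squared differences."""
    d = _diff(orig, pred)
    return int((d * d).sum())


def satd4x4(orig, pred):
    """Sum of absolute Hadamard-transformed differences for a 4x4 block
    (unnormalized; an assumed variant)."""
    return int(np.abs(H4 @ _diff(orig, pred) @ H4.T).sum())
```

Whichever feature is chosen, the thresholds used for the classification would need to be tuned to its scale, since SSD grows quadratically with the error while SAD and SATD grow linearly.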

In the embodiment above, prediction and the frequency transform are performed in units of blocks, but they may instead be performed, for example, in units of objects separated from the image background.
Similarly, although the DCT is given as an example of the frequency transform, any orthogonal transform used to remove inter-pixel correlation may be employed, such as the DST (Discrete Sine Transformation), WT (Wavelet Transformation), DFT (Discrete Fourier Transformation), or KLT (Karhunen-Loeve Transformation).

In intra mode, intra-screen prediction need not be performed; the frequency transform may be applied directly to the original image, as in MPEG-1 and MPEG-2 intra coding, and variable-length coding may also be omitted.
The present invention is applicable not only to H.261, MPEG-1, H.262/MPEG-2, MPEG-4, H.263, and H.264/AVC but also to any moving picture coding scheme, including next-generation standards to be established in the future.

100 Moving picture coding apparatus (moving picture coding apparatus according to Embodiment 1)
300 Moving picture coding apparatus (moving picture coding apparatus according to Embodiment 2)

Claims

A moving image encoding apparatus that predicts an encoding target image and encodes a difference between the predicted encoding target image and an original image, comprising:
inter-screen prediction means for predicting the encoding target image with reference to an image different from the encoding target image;
intra-screen prediction means for predicting the encoding target image with reference to the same image as the encoding target image; and
mode selection means for determining which of the prediction results of the inter-screen prediction means and the intra-screen prediction means is used when encoding the difference,
wherein the mode selection means performs the determination based on a threshold set as a parameter value at the time of inter-screen prediction.

The moving image encoding apparatus according to claim 1,
wherein the mode selection means selects the inter-screen prediction means in a region where the prediction error of inter-screen prediction is small, and selects the intra-screen prediction means in a region where the prediction error of inter-screen prediction is large.

A moving image encoding method that encodes a difference between a predicted encoding target image and an original image, wherein,
when encoding the difference between the predicted encoding target image and the original image,
a selection between the case where the encoding target image is predicted with reference to an image different from the encoding target image and the case where the encoding target image is predicted with reference to the same image as the encoding target image
is made based on a threshold set as a parameter value at the time of inter-screen prediction.