JP2009194474A

JP2009194474A - Video encoding device

Info

Publication number: JP2009194474A
Application number: JP2008030955A
Authority: JP
Inventors: Kaoru Matsuoka; 薫松岡; Tomoya Kodama; 知也児玉
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2008-02-12
Filing date: 2008-02-12
Publication date: 2009-08-27

Abstract

[Problem]
To provide a video encoding device that takes direct mode performance into account and can set a threshold correlated with motion complexity.
[Solution]
An evaluation value calculation unit 202 calculates the sum of absolute values of the motion prediction residuals obtained in direct mode for a block of the encoding target image; that is, the sum of absolute prediction residuals against an already encoded reference frame is calculated as the direct mode evaluation value. A threshold setting unit 204 calculates, from the average of the predicted code amounts of the blocks of the reference frame, a threshold used to decide the motion prediction method of the encoding target image. A comparison unit 205 compares the evaluation value calculated by the evaluation value calculation unit 202 with the threshold calculated by the threshold setting unit 204. A determination unit 206 decides, from the result of the comparison unit 205, whether to select direct mode as the motion prediction method of the encoding target image. When the evaluation value is smaller than the threshold, direct mode is selected.

[Selected drawing] Figure 2

Description

The present invention relates to a video encoding device.

When the frame to be encoded is a bi-predictive picture, it is encoded using motion prediction from a plurality of reference images. Several motion prediction modes are available, and the best one is selected from among them. However, if the code amount produced by every mode is calculated and one mode is then chosen from the results, the calculations for all the modes that were not adopted are wasted.

A technique has been proposed that decides the motion prediction mode early by comparing a function of the quantization parameter of the encoding target block with an evaluation value based on the motion prediction residual of the spatial direct mode (see, for example, Non-Patent Document 1).
Non-Patent Document 1: Xiaoan Lu, “Fast Mode Decision and Motion Estimation for H.264 with a focus on Mpeg-2/H.264 transcoding,” ISCAS, 2005
Non-Patent Document 2: ITU-T Recommendation H.264, “Advanced Video Coding for generic audiovisual services.” (ISO/IEC 14496-10 | ITU-T Rec. H.264)

The above method makes the early decision with a threshold that does not reflect the motion complexity of the encoding target image, so it cannot reliably determine the appropriate motion prediction method. The spatial direct mode predicts motion from the surrounding motion vectors, so when adjacent blocks of the encoding target image share the same motion it yields an accurate motion vector. When the direction and magnitude of motion differ from block to block, however, the motion vector calculated in spatial direct mode is inaccurate. In other words, the evaluation value based on the direct mode motion prediction residual depends on the motion complexity of the encoding target image. With a threshold that does not reflect motion complexity, the rate at which direct mode is chosen early therefore fluctuates widely.

The present invention has been made to solve this problem, and its object is to provide a video encoding device that decides on direct mode early using a threshold that reflects motion complexity.

To achieve the above object, the invention provides a video encoding device that encodes image information by selecting, for each encoding target block of an encoding target frame, a motion compensation mode from among a plurality of motion compensation methods, the device comprising: a first calculation unit that calculates the average of the motion vector code amounts of the blocks of a reference frame of the encoding target frame; a second calculation unit that calculates, for each reference frame, a threshold based on the code amount average calculated by the first calculation unit; a motion vector calculation unit that obtains a motion vector of the encoding target block by a first motion prediction method that uses the motion vectors of a plurality of already encoded blocks adjacent to the encoding target block; a third calculation unit that calculates an evaluation value based on the motion prediction residual of the motion vector obtained by the motion vector calculation unit for the encoding target block; a comparison unit that compares the threshold with the evaluation value; a determination unit that, when the evaluation value is smaller than the threshold, adopts the motion vector obtained by the first motion prediction method as the motion vector of the encoding target block; a motion prediction calculation unit that, when the evaluation value is equal to or greater than the threshold, performs motion prediction of the encoding target block by a second motion prediction method different from the first motion prediction method; and an encoding unit that encodes the encoding target block according to the motion prediction method thus determined.

According to the present invention, the efficiency of early determination of the motion prediction method for an encoding target frame can be improved.

Embodiments of the present invention are described below.

(First embodiment)
FIG. 1 is a block diagram showing the video encoding device of the first embodiment, which encodes video in conformance with H.264/MPEG-4 AVC. Its main components are a difference value calculation unit 101, an orthogonal transform unit 102, a quantization unit 103, a motion prediction unit 104A, an inverse quantization unit 105, an inverse orthogonal transform unit 106, a frame memory 107, an addition unit 108, and an entropy encoding unit 109.

The frame memory 107 stores decoded images of already encoded frames, frame by frame. The motion prediction unit 104A detects the motion of each block of the encoding target frame relative to the already encoded reference frames stored in the frame memory 107, and obtains a motion vector and a motion prediction signal. The difference value calculation unit 101 subtracts the motion prediction signal calculated by the motion prediction unit 104A from the video signal to be encoded and outputs the resulting difference signal, the motion compensated prediction error signal.

The orthogonal transform unit 102 applies a discrete cosine transform to the difference value (motion compensated prediction error signal) output by the difference value calculation unit 101. The quantization unit 103 quantizes the transform coefficients obtained by the discrete cosine transform and outputs them. The discrete cosine transform is one kind of orthogonal transform that decomposes an image into spatial frequency components; various other orthogonal transforms are known, and another may be used where appropriate, for example when the device is applied to a different coding scheme. The entropy encoding unit 109 applies variable length coding to the quantized transform coefficients output by the quantization unit 103.

The inverse quantization unit 105 inversely quantizes the quantized transform coefficients output by the quantization unit 103 and restores discrete cosine transform coefficients. The inverse orthogonal transform unit 106 applies an inverse discrete cosine transform to the coefficients output by the inverse quantization unit 105 and generates a decoded image of the motion prediction residual. The addition unit 108 adds the decoded prediction residual to the prediction value, thereby reproducing the pixel values that result from encoding, and stores the resulting decoded image in the frame memory 107.

FIG. 2 is a block diagram showing the motion prediction unit 104A in isolation.

The motion prediction unit 104A includes a motion vector calculation unit 201, an evaluation value calculation unit 202, a code amount calculation unit 203, a threshold setting unit 204, a comparison unit 205, a determination unit 206, a motion prediction calculation unit 207, and a method selection unit 208.

The motion vector calculation unit 201 calculates a motion vector by median prediction, that is, from the median of the motion vectors of previously encoded blocks surrounding the current block in the encoding target frame.
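Median prediction can be illustrated with a minimal sketch. It assumes three neighboring blocks and takes the median of each component separately; the function name `median_mv` and the sample vectors are hypothetical, and the neighbor availability and reference index rules of H.264 are left out.

```python
from statistics import median

def median_mv(neighbors):
    """Component-wise median of neighboring motion vectors given as (x, y) pairs."""
    xs = [mv[0] for mv in neighbors]
    ys = [mv[1] for mv in neighbors]
    return (median(xs), median(ys))

# Hypothetical vectors of three already encoded neighbors (e.g. left, top, top-right)
predicted = median_mv([(4, -2), (6, 0), (5, 8)])
print(predicted)  # (5, 0)
```

Taking the median per component keeps a single outlying neighbor from dragging the prediction away from the dominant motion.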

The evaluation value calculation unit 202 calculates the sum of absolute values of the motion prediction residuals obtained in direct mode for a block of the encoding target image. Specifically, it computes, as the direct mode evaluation value, the pixel-by-pixel sum of absolute differences between the video signal to be encoded and the block that the motion vector points to in the decoded image of an already encoded reference frame held in the frame memory 107.

The code amount calculation unit 203 calculates the predicted code amount of the motion vectors that were obtained, when the reference frame was encoded, with the motion prediction method selected by the method selection unit 208.

The threshold setting unit 204 calculates the threshold used to decide early that the motion prediction method of the encoding target block is direct mode. It first calculates the motion vector code amount for each block of the reference frame and takes the average of these code amounts. From this average it calculates the threshold that decides the motion prediction method of the encoding target image. The calculation is described later.

The comparison unit 205 compares the evaluation value calculated by the evaluation value calculation unit 202 with the threshold calculated by the threshold setting unit 204, and outputs the result to the determination unit 206.

The determination unit 206 decides, from the result of the comparison unit 205, whether to select direct mode as the motion prediction method of the encoding target image. When the evaluation value is smaller than the threshold, it sets the method selection unit 208 to select direct mode. When the evaluation value is larger than the threshold, it requests the motion prediction calculation unit 207 to evaluate the other motion prediction methods so that one of them can be selected.

The motion prediction calculation unit 207 evaluates the motion prediction methods other than direct mode when direct mode has not been selected. It outputs the evaluation values of the 16x16, 8x16, and 16x8 motion prediction methods to the method selection unit 208. The methods evaluated here, other than direct mode, may be any of those defined in the standard as described in Non-Patent Document 2.

When the determination unit 206 has selected direct mode, the method selection unit 208 sets the motion prediction method of the encoding target frame to direct mode. Otherwise, it adopts the motion prediction method with the smallest evaluation value among those obtained by the evaluation value calculation unit 202 and the motion prediction calculation unit 207, and outputs the motion vector and the difference image signal of that method.

Next, the operation of the video encoding device of this embodiment is described.

FIG. 3 is a flowchart showing the threshold setting process of the motion prediction unit 104A. The process is executed once encoding has finished for the frame that, among the frames referenced by the encoding target frame subject to early determination, immediately precedes it in coding order, as shown in the lower part of FIG. 7.

(Step S101)
In step S101, the predicted code amount of the motion vectors is calculated for one of the reference frames of the encoding target frame. The predicted code amount cost of the motion vector is obtained for each macroblock of the reference frame, using equation (1).

Here λ is a predetermined coefficient that depends on the quantization parameter, mvx and mvy are the x and y components of the motion vector, and mvpredx and mvpredy are the x and y components of the predicted motion vector calculated by one of the prediction methods.

The average value cost_ave is calculated by summing cost over all blocks of the reference frame and dividing the sum by the number of macroblocks in one frame.
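Equation (1) itself does not appear in this text, so the following is only a sketch under an assumption: that cost is λ times the number of bits needed to code the motion vector difference (mvx - mvpredx, mvy - mvpredy) with signed Exp-Golomb codes, as is common in H.264 encoders. The helper names `se_golomb_bits`, `mv_cost`, and `cost_average` are hypothetical.

```python
def se_golomb_bits(v):
    """Bit length of the signed Exp-Golomb code for integer v."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1

def mv_cost(lam, mv, mv_pred):
    """Assumed form of the predicted code amount: lambda times the bits of the MV difference."""
    dx = mv[0] - mv_pred[0]
    dy = mv[1] - mv_pred[1]
    return lam * (se_golomb_bits(dx) + se_golomb_bits(dy))

def cost_average(costs):
    """cost_ave: sum of per-macroblock costs divided by the number of macroblocks."""
    return sum(costs) / len(costs)

# Two hypothetical macroblocks of a reference frame
costs = [mv_cost(2.0, (3, -1), (1, -1)), mv_cost(2.0, (0, 0), (0, 0))]
print(cost_average(costs))  # 8.0
```

Whatever the exact form of equation (1), the averaging step is as the text states: frames whose motion vectors deviate strongly from their predictions yield a larger cost_ave.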

(Step S102)
Next, the threshold setting unit 204 calculates thres_med from cost_ave, the average predicted code amount of the motion vectors of the reference frame, using equation (2).

(Step S103)
Next, the threshold setting unit 204 sets the threshold thres used to decide on direct mode. The threshold thres is set according to equation (3).

$$\mathrm{thres} = \mathrm{MAX}(\mathrm{thres\_med},\ \mathrm{THRES\_MIN}) \qquad (3)$$

Here THRES_MIN is a predetermined constant and MAX(x, y) is a function that returns the larger of x and y. Thus the larger of thres_med, calculated in step S102, and THRES_MIN, set in advance as the lower limit of thres, is selected and set as the threshold thres.

As equation (2) shows, thres_med is calculated from cost_ave, an evaluation value that indicates the motion complexity of the encoding target image.

The more complex the motion, the worse motion prediction performs and the larger cost_ave becomes, and thres_med increases with cost_ave. Conversely, when motion complexity is low, cost_ave is small and so is thres_med; in that case the motion vector predicted by direct mode can be regarded as highly reliable. Because cost_ave and the evaluation value based on the direct mode residual are known to be logarithmically related, equation (2) uses a logarithm to compute the threshold.

An examination of the relationship between image degradation and the early decision threshold thres_med shows that, regardless of the frame, using a threshold of a certain size causes little image degradation. In equation (3), therefore, when thres_med is smaller than the predetermined value THRES_MIN, the threshold is set to THRES_MIN so that direct mode is still decided early. This prevents the threshold from becoming so small, precisely when direct mode prediction is highly reliable, that other prediction methods which would not be adopted anyway end up being evaluated.
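The threshold computation can be sketched as follows. Equation (2) is not reproduced in this text; the sketch assumes the form thres_med = a * log(cost_ave) + b, which matches the statements that it uses a logarithm and has parameters (a, b), but the exact formula and the numeric values used below are assumptions. The lower bound follows the description of equation (3).

```python
import math

def thres_med(cost_ave, a, b):
    """Assumed form of equation (2): logarithmic in cost_ave with parameters a and b."""
    return a * math.log(cost_ave) + b

def threshold(cost_ave, a, b, thres_min):
    """Equation (3) as described: the larger of thres_med and THRES_MIN."""
    if cost_ave <= 0:
        # log is undefined here; fall back to the lower limit (a choice made for this sketch)
        return thres_min
    return max(thres_med(cost_ave, a, b), thres_min)

# Hypothetical parameters: a=100, b=50, THRES_MIN=200
print(threshold(1.0, 100.0, 50.0, 200.0))    # simple motion: clamped to 200.0
print(threshold(20.0, 100.0, 50.0, 200.0))   # complex motion: above the lower limit
```

With these sample values a frame with simple motion gets the floor value, while a frame with complex motion gets a larger threshold.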

FIG. 4 is a flowchart showing how the motion prediction unit 104A decides on direct mode early.

(Step S201)
First, the motion vector calculation unit 201 predicts the direct mode motion vector of the encoding target image. The evaluation value calculation unit 202 then calculates BD_SAD, the sum of absolute differences between a block of the encoding target image and the block of the decoded reference frame that the direct mode motion vector points to. BD_SAD is calculated by equation (4).

$$\mathrm{BD\_SAD} = \sum_{(x,y)\,\in\,\mathrm{block}} \left|\,\mathrm{Cur}(x,y) - \mathrm{Ref}(x+mvx,\ y+mvy)\,\right| \qquad (4)$$

Here (x, y) is a pixel position in the frame and Cur(x, y) is the pixel value of the encoding target image at (x, y). mvx and mvy are the x and y components of the direct mode motion vector. Ref(x+mvx, y+mvy) is the pixel value of the decoded reference image at position (x+mvx, y+mvy). The calculation is performed for each macroblock of the encoding target frame, and the resulting BD_SAD is used as the direct mode evaluation value of that block.
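The sum of absolute differences described for equation (4) can be sketched as below. The sketch assumes integer-pel motion vectors and frames stored as lists of rows; the function name `bd_sad` and its block position and size parameters are hypothetical, and sub-pel interpolation and frame boundary handling are omitted.

```python
def bd_sad(cur, ref, bx, by, mvx, mvy, size=16):
    """Sum over the block of |Cur(x, y) - Ref(x + mvx, y + mvy)|.

    cur, ref: frames as lists of rows of pixel values
    (bx, by): top-left corner of the block; size: block width and height
    """
    total = 0
    for y in range(by, by + size):
        for x in range(bx, bx + size):
            total += abs(cur[y][x] - ref[y + mvy][x + mvx])
    return total

# Tiny hypothetical frames: the current frame equals the reference shifted by one pixel in x
ref = [[x for x in range(4)] for _ in range(4)]
cur = [[x + 1 for x in range(4)] for _ in range(4)]
print(bd_sad(cur, ref, 0, 0, 1, 0, size=2))  # 0: the vector (1, 0) predicts the block exactly
print(bd_sad(cur, ref, 0, 0, 0, 0, size=2))  # 4: zero motion leaves a residual of 1 per pixel
```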

Although the sum of absolute values of the prediction residual is used as the evaluation value here, the sum of absolute values after applying a Hadamard transform to the prediction residual (SATD) or the sum of squared errors (SSD) may be used instead.
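The two alternative measures can be sketched for a 4x4 residual block as follows. This is an illustration only: the block size, the unnormalized Hadamard matrix `H4`, and the helper names are assumptions, and practical encoders often scale SATD by a constant factor.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose)
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]

def _matmul4(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def satd4x4(res):
    """Sum of absolute values of the 2-D Hadamard transform of a 4x4 residual."""
    t = _matmul4(_matmul4(H4, res), H4)
    return sum(abs(v) for row in t for v in row)

def ssd(res):
    """Sum of squared errors of a residual block."""
    return sum(v * v for row in res for v in row)

flat = [[1] * 4 for _ in range(4)]  # constant residual of 1
print(satd4x4(flat), ssd(flat))     # 16 16
```

For the constant residual all energy lands in the single DC coefficient, which is why the unnormalized SATD here equals the plain sum of absolute values.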

(Step S202)
Next, the comparison unit 205 compares the evaluation value BD_SAD with the threshold thres set by the threshold setting process of FIG. 3. If BD_SAD is smaller than thres, the process proceeds to step S203. If BD_SAD is larger than thres, the process proceeds to step S204.

(Step S203)
When the evaluation value BD_SAD is smaller than the threshold thres, the determination unit 206 sets the motion prediction method of the block to direct mode and notifies the method selection unit 208 of this decision.

(Step S204)
Otherwise, the motion prediction calculation unit 207 goes on to compute the motion prediction methods other than direct mode and outputs the evaluation value of each method to the method selection unit 208. The method selection unit 208 adopts the motion prediction method with the smallest evaluation value and outputs the motion vector and difference image signal of that method.
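Steps S202 to S204 amount to a short-circuit decision, sketched below. The function `choose_mode`, the mode labels, and the callback `evaluate_other_modes` are hypothetical names; the callback stands in for the motion prediction calculation unit 207 and is only invoked when the early decision fails.

```python
def choose_mode(bd_sad, thres, evaluate_other_modes):
    """Return the selected mode label for one block.

    bd_sad: direct mode evaluation value
    thres: threshold for the early decision
    evaluate_other_modes: callable returning {mode_name: evaluation_value}
    """
    if bd_sad < thres:
        # Step S203: decide on direct mode without evaluating anything else
        return "direct"
    # Step S204: evaluate the remaining modes and keep the smallest value,
    # with direct mode still among the candidates
    candidates = {"direct": bd_sad}
    candidates.update(evaluate_other_modes())
    return min(candidates, key=candidates.get)

calls = []

def other_modes():
    calls.append(1)  # count how often the expensive search runs
    return {"16x16": 900, "8x16": 1100, "16x8": 1200}

print(choose_mode(300, 500, other_modes), len(calls))   # direct 0
print(choose_mode(1500, 500, other_modes), len(calls))  # 16x16 1
```

The saving comes from the first branch: every block that passes the threshold test skips the search over the other partition sizes entirely.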

FIG. 7 shows an example of an H.264 prediction structure. The upper part shows the frames in display order along the time axis, and the lower part shows the order in which the frames are encoded.

Frame P3, a reference frame, is encoded before frame B1. When deciding the motion prediction method for frame B1, the predicted code amount of the motion vectors of frame P3 is calculated in step S101, and the sum of absolute prediction residuals between frame P3 and frame B1 is obtained in step S201.

The video encoding device of this embodiment calculates the threshold from the average predicted code amount of the motion vectors of the reference frame, which correlates with the motion complexity of the encoding target image. Using the average motion vector code amount of the reference frame evaluates motion complexity at the level of the whole picture. This makes it possible to set, for the encoding target image, a threshold that reflects motion prediction performance.

In addition, deciding early on direct mode, one of the motion prediction methods, by comparing its evaluation value with the threshold allows the computation for the other motion prediction methods to be skipped efficiently.

(Second embodiment)
The video encoding device of the second embodiment is now described in detail. Descriptions of parts that are the same as in FIG. 2 are omitted.

FIG. 5 is a block diagram showing the motion prediction unit 104A of the video encoding device of the second embodiment in isolation.

In addition to the components shown in FIG. 2, the motion prediction unit 104A of this embodiment includes a threshold calculation formula setting unit 309.

Each time encoding of an encoding target frame finishes, the threshold calculation formula setting unit 309 recalculates the parameters (a, b) of the formula of equation (2) used by the threshold setting unit 304. Specifically, it sums the evaluation values BD_SAD calculated by the evaluation value calculation unit 302 over all blocks in the frame, divides the sum by the number of macroblocks, and uses the result as input to reset the coefficients of the formula of the threshold setting unit 304. The threshold calculation formula setting unit 309 also has a function for detecting scene changes.

The coefficients a and b used to compute thres_med are then set by equation (5).

BD_SAD_AVG is the average of the evaluation values BD_SAD of an encoded frame. It is calculated by summing the per-block evaluation values BD_SAD over one frame and dividing by the number of blocks in the frame. count is the number of times the threshold calculation formula has been reset by equation (5) for the sequence. As count increases, the amount by which a and b in equation (2) are changed is gradually narrowed.

Because the relationship between motion vector code amount and direct mode performance varies little within a single scene, the threshold is stabilized by gradually narrowing the adjustment range while the scene stays the same. A scene change can greatly alter this relationship, so it is preferable to reset count to 0 at scene change points.
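The role of count can be illustrated with a sketch. Equation (5) is not reproduced in this text, so the update rule below is a hypothetical stand-in and not the patent's formula: it keeps a fixed, adjusts only b toward the value that would make a * log(cost_ave) + b equal BD_SAD_AVG, and shrinks the step as 1 / count. The class and method names are likewise assumptions.

```python
import math

class ThresholdModel:
    """Holds the parameters (a, b) of the assumed form thres_med = a * log(cost_ave) + b."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.count = 0  # number of resets since the last scene change

    def on_scene_change(self):
        # Restart the adaptation so the next update takes a full step
        self.count = 0

    def update(self, bd_sad_avg, cost_ave):
        # Hypothetical rule: move b toward the value that fits this frame,
        # with a step that narrows as count grows (cost_ave must be positive)
        target_b = bd_sad_avg - self.a * math.log(cost_ave)
        self.count += 1
        self.b += (target_b - self.b) / self.count

model = ThresholdModel(a=100.0, b=0.0)
model.update(500.0, 1.0)   # first frame of a scene: b jumps to 500.0
model.update(700.0, 1.0)   # second frame: b moves only halfway, to 600.0
model.on_scene_change()
model.update(300.0, 1.0)   # after a scene change: full step again, b = 300.0
print(model.b, model.count)
```

Under this stand-in rule b is the running mean of the per-frame fits within a scene, which is one simple way to obtain the narrowing behaviour the text describes.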

FIG. 6 is a flowchart showing the process by which the motion prediction unit 104A resets the threshold calculation formula. It is executed after the encoding target frame has been encoded.

(Step S301)
First, the average value BD_SAD_AVG is calculated by dividing the sum of the per-block direct mode evaluation values BD_SAD, calculated by the evaluation value calculation unit 302 during encoding, by the number of macroblocks.

(Step S302)
Next, the threshold calculation formula setting unit 309 resets the threshold calculation formula, equation (2), based on BD_SAD_AVG, thres_med of equation (2), and cost_ave, the average of cost of equation (1). The reset is performed according to equation (5).

The video encoding device of this embodiment resets the threshold calculation formula, equation (2), using the average of the direct mode evaluation values actually computed during motion prediction. The relationship between the average predicted code amount of the motion vectors and direct mode performance varies greatly even within a sequence. By dynamically reflecting in the threshold, each time the scene changes, the relationship between the average predicted code amount of the motion vectors of the referenced frame and the direct mode performance of the encoding target frame, the motion prediction method can be decided early with higher efficiency.

The embodiments are not limited to the above examples and can be modified in various ways. For example, the value calculated by the code amount calculation unit 203 may be replaced by the amount of encoded motion vector data output by the entropy encoding unit 109. The evaluation values calculated by the evaluation value calculation unit 202 and by the calculation unit for the other motion prediction methods may be SATD, the sum of absolute values of the frequency components obtained by applying a Hadamard transform.

Furthermore, the motion vector calculated by the motion vector calculation unit 201 need not be the direct mode motion vector described in Non-Patent Document 2, which is the median of the motion vectors of the surrounding blocks; it may instead be the average of those motion vectors.

In addition, the reference frame used to calculate the average of the motion vector predicted code amount cost given by formula (1) need not be the immediately preceding reference frame in coding order, shown in the lower part of FIG. 7; it may be the nearest frame in display order, shown in the upper part of FIG. 7.
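The two ways of choosing the reference frame can be contrasted with a short Python sketch. The `Frame` record, the function names, the I-B-B-P example, and the restriction to already-encoded reference frames are assumptions for illustration; FIG. 7 itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str
    display: int        # position in display order
    coding: int         # position in coding order
    is_reference: bool  # whether other frames may refer to it


def _encoded_references(frames, target):
    """Reference frames that were encoded before the target frame."""
    return [f for f in frames if f.is_reference and f.coding < target.coding]


def previous_reference_in_coding_order(frames, target):
    """Reference frame encoded most recently before the target."""
    return max(_encoded_references(frames, target), key=lambda f: f.coding)


def nearest_reference_in_display_order(frames, target):
    """Already-encoded reference frame closest to the target in display order.

    Ties are broken in favour of the more recently encoded frame.
    """
    return min(_encoded_references(frames, target),
               key=lambda f: (abs(f.display - target.display), -f.coding))


# Display order I0 B1 B2 P3, coding order I0 P3 B1 B2 (hypothetical example).
i0 = Frame("I0", display=0, coding=0, is_reference=True)
p3 = Frame("P3", display=3, coding=1, is_reference=True)
b1 = Frame("B1", display=1, coding=2, is_reference=False)
b2 = Frame("B2", display=2, coding=3, is_reference=False)
frames = [i0, p3, b1, b2]

print(previous_reference_in_coding_order(frames, b1).name)  # P3
print(nearest_reference_in_display_order(frames, b1).name)  # I0
```

In this example the two rules agree for B2 but differ for B1; both functions raise `ValueError` if no reference frame has been encoded before the target.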

FIG. 1 is a functional block diagram showing the moving picture encoding apparatus of the first embodiment.
FIG. 2 is a functional block diagram showing the motion prediction unit 104A of the moving picture encoding apparatus of the first embodiment.
FIG. 3 is a flowchart showing the threshold calculation operation of the first embodiment.
FIG. 4 is a flowchart showing the early mode decision operation of the first embodiment.
FIG. 5 is a block diagram showing the motion prediction unit 104A of the moving picture encoding apparatus of the second embodiment.
FIG. 6 is a flowchart showing the operation of resetting the threshold calculation formula in the second embodiment.
FIG. 7 is a diagram showing an example of an H.264/AVC prediction structure.

Explanation of symbols

101 … difference value calculation unit
102 … orthogonal transform unit
103 … quantization unit
104 … motion prediction unit
105 … inverse quantization unit
106 … inverse orthogonal transform unit
107 … frame memory
108 … addition unit
109 … entropy encoding unit
201, 301 … motion vector calculation unit
202, 302 … evaluation value calculation unit
203, 303 … code amount calculation unit
204, 304 … threshold setting unit
205, 305 … comparison unit
206, 306 … determination unit
207, 307 … motion prediction calculation unit
208, 308 … method selection unit
309 … threshold calculation formula setting unit

Claims

1. A moving picture encoding apparatus that encodes image information by selecting, for each encoding target block of an encoding target frame, a motion compensation mode from a plurality of motion compensation methods, the apparatus comprising:
a first calculation unit that calculates, for each reference frame of the encoding target frame, a code amount average value, which is the average of the code amounts of the motion vectors of the blocks of that reference frame;
a second calculation unit that calculates a threshold for each reference frame based on the code amount average value calculated by the first calculation unit;
a motion vector calculation unit that obtains a motion vector of the encoding target block based on a first motion prediction method that uses the motion vectors of a plurality of encoded blocks adjacent to the encoding target block;
a third calculation unit that calculates an evaluation value based on the motion prediction residual of the encoding target block for the motion vector obtained by the motion vector calculation unit;
a comparison unit that compares the threshold with the evaluation value;
a determination unit that, when the evaluation value is smaller than the threshold, determines the motion vector of the first motion prediction method to be the motion vector of the encoding target block;
a motion prediction calculation unit that, when the evaluation value is equal to or greater than the threshold, performs motion prediction of the encoding target block based on a second motion prediction method different from the first motion prediction method;
a method selection unit that selects a motion prediction method from among the motion prediction methods obtained; and
an encoding unit that encodes the encoding target block based on the selected motion prediction method.

2. The moving picture encoding apparatus according to claim 1, wherein the second calculation unit calculates the threshold such that the threshold increases as the code amount average value increases.

3. The moving picture encoding apparatus according to claim 2, further comprising a calculation formula setting unit that sets, as a calculation formula, a linear function of the code amount average value of the reference frame of the encoding target frame, based on a value calculated from an evaluation value average value, which is the average of the evaluation values of an encoded frame that has already undergone encoding, and on the code amount average value of the reference frame of that encoded frame,
wherein the second calculation unit calculates the threshold based on the calculation formula.

4. The moving picture encoding apparatus according to claim 3, wherein the calculation formula set by the calculation formula setting unit is a linear function relating to a logarithm of the code amount average value.

5. The moving picture encoding apparatus according to claim 4, wherein the calculation formula setting unit reduces the change in the slope and intercept of the linear function as the number of times that the linear function is set increases.

6. The moving picture encoding apparatus according to claim 5, further comprising a scene detection unit that detects a scene change in the input group of encoding target frames,
wherein the calculation formula setting unit initializes the number of times the coefficients have been set when the scene detection unit detects a scene change.

7. The moving picture encoding apparatus, wherein the third calculation unit calculates, as the evaluation value, the sum of absolute differences between the encoding target block and the block of the reference frame indicated by the motion vector of the motion prediction method.

8. The moving picture encoding apparatus according to claim 1, wherein the second calculation unit uses a predetermined lower limit value as the threshold when the calculated threshold is smaller than that lower limit value.

9. The moving picture encoding apparatus according to claim 1, wherein the motion vector calculation unit calculates the motion vector by median prediction from the motion vectors of a plurality of encoded blocks adjacent to the encoding target block.

10. The moving picture encoding apparatus according to claim 1, wherein the motion vector calculation unit calculates the motion vector as the average of the motion vectors of a plurality of encoded blocks adjacent to the encoding target block.