JP2000023163A

JP2000023163A - Motion vector detection device

Info

Publication number: JP2000023163A
Application number: JP18401798A
Authority: JP
Inventors: Masaaki Hyodo; 正晃兵頭; Hiroshi Kusao; 寛草尾; Yoichi Fujiwara; 陽一藤原
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1998-06-30
Filing date: 1998-06-30
Publication date: 2000-01-21

Abstract

(57) [Abstract]

[Problem] To reduce the amount of computation by thinning out the pixels used in the block matching operation of a motion vector search, and also to greatly reduce the circuit scale of the data delay units placed between arithmetic units.

[Solution] Coded pixel values are stored in a coding memory 1 and then input to PEs 6 and 7 block by block in the horizontal direction. Reference pixel values are accumulated in a reference memory 2, input to reference registers 4 and 5, and then input to PEs 6 and 7. PE 6 computes the absolute differences between the outputs of reference registers 4 and 5 and the coded pixel values. PE 7 performs its own absolute-difference operation and outputs the sum of that result and the absolute-difference result of PE 6, delayed by 5 clocks in a data delay unit 8. PE 7 thus outputs the AE between the coding block and each prediction block candidate.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a motion vector detection device that uses the block matching method to detect the motion vectors used in motion-compensated coding of moving images, and in particular to a motion vector detection device in which subsampling reduces the number of pixels used for matching, thereby reducing the amount of computation and the circuit scale.

[0002]

[Prior Art] In recent years, interframe coding methods that use motion-compensated prediction, such as MPEG1 (ISO/IEC 11172) and MPEG2 (ISO/IEC 13818), have come into use in the fields of storage, communication, and broadcasting. These methods perform motion-compensated prediction: each picture of a moving image sequence is divided into coding blocks, and for each coding block a prediction block is obtained using a motion vector detected from a reference picture.

[0003] The block matching method is a known way of detecting motion vectors. In block matching, an error amount is calculated between the coding block and each prediction block candidate of the same size as the coding block within the motion vector search range. The candidate with the smallest error amount is taken as the prediction block, and the displacement of the prediction block position relative to the coding block position is taken as the motion vector.

[0004] If the error amount is defined as the sum of absolute differences (hereinafter AE) between the pixels in the coding block and the pixels in the search range, AE is given by Equation (1).

[0005]

[Equation 1]
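Equation (1) appears as an image in the original publication. Assuming zero-based pixel indices x and y inside the coding block and the symbols defined in paragraph [0006], a form consistent with that description would be the following; the exact index convention is an assumption.

```latex
AE_{i,j} \;=\; \sum_{x=0}^{H-1} \sum_{y=0}^{V-1} \left| R_{x+i,\,y+j} - T_{x,y} \right|,
\qquad -K \le i \le K-1, \quad -L \le j \le L-1
```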

[0006] In Equation (1), Rx,y is the reference pixel data, Tx,y is the coded pixel data, and AEi,j is the error amount with respect to the prediction block candidate whose motion vector is (i, j), for a coding block of H horizontal pixels × V vertical pixels. The motion vector search range is -K to (K-1) in the horizontal direction and -L to (L-1) in the vertical direction.
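The block matching search of paragraphs [0003] to [0006] can be sketched in software as follows. This is only an illustrative model, not the hardware described in this publication: the function names, the row-list image layout, and the convention that (bx, by) is the coding block's own position inside the reference window are all assumptions.

```python
def ae(ref, cur, bx, by, i, j):
    """Sum of absolute differences (AE) for motion vector (i, j).

    ref and cur are lists of rows; (bx, by) is the top-left corner of the
    coding block's own position inside ref (a hypothetical convention).
    """
    return sum(
        abs(ref[by + j + y][bx + i + x] - cur[y][x])
        for y in range(len(cur))
        for x in range(len(cur[0]))
    )


def full_search(ref, cur, bx, by, K, L):
    """Return (minimum AE, (i, j)) over i in [-K, K-1] and j in [-L, L-1]."""
    candidates = (
        (ae(ref, cur, bx, by, i, j), (i, j))
        for j in range(-L, L)
        for i in range(-K, K)
    )
    return min(candidates, key=lambda c: c[0])


# Usage: an 11x11 reference window and a 4x4 block copied from displacement (2, -1).
ref = [[x * x + 3 * y * y + x * y for x in range(11)] for y in range(11)]
cur = [[ref[4 - 1 + y][4 + 2 + x] for x in range(4)] for y in range(4)]
best_ae, best_mv = full_search(ref, cur, bx=4, by=4, K=4, L=4)
```

With K = L = 4 and a 4 × 4 block, the search touches an 11 × 11 reference window, which matches the r0 to r10 column numbering used later in the description.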

[0007] One motion vector detection device based on the block matching method is the technique described in Japanese Unexamined Patent Publication No. H10-136377, filed by the present applicant. FIG. 19 shows a block diagram of the motion vector detection device described in that publication, for the case where the coding block size is 4 pixels × 4 pixels. The device in FIG. 19 consists of a plurality of arithmetic units (PE in the figure) 105 to 112, data delay units (12D in the figure) 113 to 118 placed between the arithmetic units, and reference registers (R in the figure) 101 to 104 that hold the reference pixel values while shifting them sequentially. The pixel values of the coding block (hereinafter, coded pixel values) are input from terminal 119, and the reference pixel values are input from terminal 120.

[0008] FIG. 20 shows the configuration of a PE in FIG. 19. The PE comprises coding registers 123 to 126 that store coded pixel values, absolute-difference calculators 127 to 130, and adders 131 and 132. One column of coded pixel values is stored in coding registers 123 to 126, while reference pixel values are input sequentially from the external reference registers 101 to 104 of FIG. 19. The absolute-difference calculators 127 to 130 obtain the error amount between one column of coded pixel values and one column of reference pixel values, and adders 131 and 132 add the results and output the sum. The result computed in each PE is cumulatively added in turn by the PE of the following stage.

[0009] The operation of the conventional example is described below for the case where the coding block is 4 horizontal pixels × 4 vertical pixels, as shown in FIG. 21(a), and the motion vector search range is -4 to +3 horizontally and -4 to +3 vertically, as shown in FIG. 21(b).

[0010] FIG. 22 shows the relationship between the coded pixels stored in the coding registers and the reference pixels input from the reference registers. Each PE has coding registers 123 to 126 for four pixels, and there are eight PEs 105 to 112, so the registers have capacity for two blocks of coded pixels (4 × 8 pixels), and the error calculations for two coding blocks are performed simultaneously. Each PE stores one column (4 pixels) of coded pixel values, and each time one column of coded pixel values is input, one column of reference pixel values (12 pixels) is also input. While the reference pixel values are being input, the coded pixel values are held in the coding registers. The reference pixels and coded pixels always have the positional relationship shown in FIG. 22, and the sum of absolute differences between the coded pixel values held in each PE and the incoming reference pixel values is obtained.

[0011] FIGS. 23 and 24 show timing charts of how the error amounts are cumulatively added. FIG. 24 continues from FIG. 23, and the pixel symbols in the figures are those shown in FIG. 21. One column (4 pixels) of coded pixel values is input from terminal 119 of FIG. 19 every 12 clocks. First, coded pixel values t0, t1, t2, t3 are input to PE 105 at [time T = 0 to 3], and coded pixel values t4, t5, t6, t7 are input to PE 106 at [T = 12 to 15]. Reference pixel values are input from terminal 120 of FIG. 19 in the order r0, r1, and so on. One pixel of dummy data is inserted in every 12-clock period: r0, r1, r2, ..., r10, dummy data, r11, r12, ..., r21, dummy data, r22, and so on.

[0012] Once four clocks' worth of reference pixel values have been input, the outputs of reference registers 101 to 104 are (r0, r1, r2, r3) [T = 4]. Inside PE 105,

d0 = |r0 - t0| + |r1 - t1| + |r2 - t2| + |r3 - t3| [T = 4]

is therefore computed in a single step and becomes the output of PE 105. Here, d0 is the part of the AE calculation for motion vector (-4, -4) that involves coded pixel values t0 to t3. At the next clock [T = 5], the outputs of reference registers 101 to 104 become (r1, r2, r3, r4), while no coded pixel values are input and the values stored in coding registers 123 to 126 of PE 105 remain t0 to t3, so PE 105 computes

d1 = |r1 - t0| + |r2 - t1| + |r3 - t2| + |r4 - t3| [T = 5].

d1 is the part of the AE calculation for motion vector (-4, -3) that involves coded pixel values t0 to t3. Continuing in this way, PE 105 of FIG. 19 outputs d0 to d7 in order, and these are input to data delay unit 113.

[0013] Twelve clocks later, starting at [T = 16], data delay unit 113 feeds d0 to d7 in order to PE 106. In PE 106 the outputs of coding registers 123 to 126 are (t4, t5, t6, t7), and reference registers 101 to 104 hold (r11, r12, r13, r14). The output of PE 106 is therefore

e0 = d0 + |r11 - t4| + |r12 - t5| + |r13 - t6| + |r14 - t7| [T = 16].

e0 is the part of the AE calculation for motion vector (-4, -4) that involves coded pixels t0 to t7. The error amount is accumulated one column at a time in this way, and PE 108 outputs the AEs (h0 to h7) of the prediction block candidates.
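The column-by-column accumulation of paragraphs [0012] and [0013] can be checked with a small software model. The sketch below assumes the column-major numbering implied by the text (r0 to r10 is the first reference column, t0 to t3 the first block column) and fixes the horizontal candidate to the leftmost one, as for d0 and e0. The function names are hypothetical, and the clock-level timing and dummy data are not modeled.

```python
REF_ROWS = 11  # r0..r10: one reference column for a -4..+3 search and a 4-row block
BLK_ROWS = 4   # t0..t3: one coding-block column


def column_partial(r, t, col, v):
    """Partial AE of block column `col` at vertical candidate index v (0..7).

    The leftmost horizontal candidate is assumed, as for d0 and e0 in the text.
    """
    return sum(
        abs(r[col * REF_ROWS + v + y] - t[col * BLK_ROWS + y])
        for y in range(BLK_ROWS)
    )


def accumulate(r, t, v, num_cols):
    """Chain the per-column partial sums, as PE 105 -> PE 106 -> ... would."""
    total = 0
    for col in range(num_cols):
        total += column_partial(r, t, col, v)
    return total


r = [(37 * k) % 251 for k in range(4 * REF_ROWS)]  # arbitrary test pixels
t = [(53 * k + 7) % 251 for k in range(16)]
d0 = column_partial(r, t, col=0, v=0)   # vector (-4, -4), pixels t0..t3
d1 = column_partial(r, t, col=0, v=1)   # vector (-4, -3), pixels t0..t3
e0 = accumulate(r, t, v=0, num_cols=2)  # vector (-4, -4), pixels t0..t7
h0 = accumulate(r, t, v=0, num_cols=4)  # full AE for vector (-4, -4)
```

In the hardware, the 12-clock data delay units exist only to line up each partial sum with the matching candidate in the next PE; in this model that alignment is implicit in calling `column_partial` with the same `v`.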

[0014] PEs 105 to 108 and PEs 109 to 112 carry out the AE calculations for adjacent coding blocks in parallel. Because one column of coded pixels is input every 12 clocks, the input of one coding block (4 columns) finishes every 48 clocks, and input of the next coding block then begins.

[0015]

[Problems to Be Solved by the Invention] Japanese Unexamined Patent Publication No. H10-136377 also describes a device that further reduces the circuit scale by thinning out the pixels used in the error calculation for each candidate block. For example, if error values are obtained only for coding block pixels subsampled by 1/2 both horizontally and vertically, as shown in FIG. 25, the absolute-difference operation can be performed by the circuit shown in FIG. 26. In FIG. 26, 133 to 136 are arithmetic units, 137 is an input terminal for coded pixels, 138 is an input terminal for reference pixels, and 139 and 140 are output terminals for the absolute difference values; 101 to 104, 113, 114, 116, and 117 are the same circuits as in FIG. 19. Compared with FIG. 19, FIG. 26 has half as many PEs and two fewer data delay units. Each of PEs 133 to 136 is the circuit shown in FIG. 27, consisting of two registers 141 and 142, two absolute-difference calculators 143 and 144, and an adder 145, and is about half the circuit scale of the PE shown in FIG. 20.

[0016] However, the delay of each data delay unit is the same as before the coding block was subsampled. In the embodiment of Japanese Unexamined Patent Publication No. H10-136377, each data delay unit delays its input data by 12 clocks. This is the sum of the number of candidate points for the vertical motion vector component, 8, and the number of pixels of the coding block in the vertical direction, 4, and it does not change when subsampling is applied. With values commonly used in MPEG2, for example a coding block of 16 pixels × 16 pixels and a vertical motion vector search range of -16 to +15 pixels, 2 × 7 = 14 data delay units are placed between the eight PEs even when the coding block is subsampled by 1/2, and each data delay unit delays its input data by 48 clocks. If the input data to a data delay unit is 10 bits wide, the total capacity of the data delay units is 48 × 14 × 10 = 6,720 bits. Data delay units are generally built from registers, and 6,720 bits of registers amount to a very large circuit, which is a problem.
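The delay and capacity figures in paragraph [0016] follow from two small formulas, restated below as a sketch. The function and parameter names are illustrative only; the numbers are the ones given in the text.

```python
def delay_clocks(vertical_candidates, block_rows):
    """Delay per data delay unit: vertical candidate count plus block height."""
    return vertical_candidates + block_rows


def total_delay_bits(clocks, delay_units, bits_per_word):
    """Total register capacity needed by all data delay units."""
    return clocks * delay_units * bits_per_word


# 4x4 block, vertical search -4..+3 (8 candidates)
small = delay_clocks(vertical_candidates=8, block_rows=4)
# 16x16 block, vertical search -16..+15 (32 candidates)
mpeg2 = delay_clocks(vertical_candidates=32, block_rows=16)
# 14 delay units, 10-bit data
total = total_delay_bits(mpeg2, delay_units=14, bits_per_word=10)
```

Neither input to `delay_clocks` depends on the subsampling ratio, which is why thinning out pixels alone does not shrink the delay units.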

[0017] The present invention was made in view of these problems. Its object is to provide a motion vector detection device that reduces the amount of computation by thinning out the pixels used in the block matching operation of a motion vector search, and that also greatly reduces the circuit scale of the data delay units placed between the arithmetic units.

[0018]

[Means for Solving the Problems] The present invention is a motion vector detection device that performs block matching on all prediction block candidates in a search area using representative pixel values subsampled in a predetermined pattern from the pixels constituting a coding block, finds the prediction block that best matches the coding block, and thereby detects a motion vector.

[0019] The invention of claim 1 comprises a coding memory that stores a coding block, a reference memory that stores the reference pixel values of the prediction block candidates, a memory controller that controls the input and output of the coding memory and the reference memory, reference registers that store reference pixel values from the reference memory, a plurality of data delay units that delay the output timing of data, and a plurality of arithmetic units connected in series via the data delay units. The memory controller causes the representative pixel values of the subsampled coding block to be output from the coding memory and distributed among the arithmetic units. Based on the subsampling pattern, it also reads from the reference memory the reference pixel values of the prediction block candidates that correspond to the representative pixel values of the coding block, and inputs the reference pixel values of all prediction block candidates in the search area to the reference registers in order. Each arithmetic unit holds pixel values of the subsampled coding block, computes the error amount between those pixel values and the corresponding pixel values in the reference registers, and cumulatively adds this error amount to the output value of another arithmetic unit. Each data delay unit adjusts the output timing so that the error value output from the preceding arithmetic unit is cumulatively added, in the following arithmetic unit, to the error amount for the same prediction block candidate. The motion vector is then determined by taking as the prediction block the prediction block candidate with the smallest of the error amounts output from the final-stage arithmetic unit.

[0020] The invention of claim 2 is the motion vector detection device of claim 1, comprising two sets of the plurality of data delay units and the plurality of arithmetic units, wherein the arithmetic units of one set hold the representative pixel values of a coding block that has already been input, the memory controller causes the representative pixel values of the coding block adjacent to that coding block to be output from the coding memory to the arithmetic units of the other set, and the arithmetic units of both sets perform their operations on the representative pixel values and the reference pixels of a common prediction block candidate.

[0021] The invention of claim 3 is the motion vector detection device of claim 1, comprising two sets of the plurality of data delay units and the plurality of arithmetic units, wherein the memory controller causes the representative pixel values of one coding block to be output from the coding memory to the arithmetic units of one set, and causes the representative pixel values of the coding block next but one to that coding block to be output from the coding memory to the arithmetic units of the other set, and the arithmetic units of both sets perform their operations on the representative pixel values and the reference pixels of a common prediction block candidate.

[0022] The invention of claim 4 is the motion vector detection device of claim 1, 2, or 3, wherein each arithmetic unit comprises M coding registers that store M coded pixel values, M being the number of pixels on one side of the subsampled coding block; M absolute-difference calculators that compute the absolute differences between the coded pixel values and the corresponding pixel values in the reference registers; and an adder that adds the output of the preceding arithmetic unit and the outputs of the M absolute-difference calculators.

[0023] The invention of claim 5 is the motion vector detection device of claim 4, wherein there are 2M reference registers and a selector is further provided that selects and outputs M items of data from the reference registers, the selector selecting only the M valid register outputs and supplying them to the arithmetic units, so that the error calculations for the search range that overlaps between coding blocks are performed by common arithmetic units in a time-division manner.

[0024] The invention of claim 6 is the motion vector detection device of claim 1, wherein each arithmetic unit comprises M first coding registers that store M coded pixel values, M being the number of pixels on one side of the subsampled coding block; M second coding registers that store M coded pixels of the coding block adjacent to the coding block containing those coded pixel values; a selector that selects M register outputs from the total of 2M first and second coding registers; M absolute-difference calculators; and an adder that adds the output of another arithmetic unit and the outputs of the absolute-difference calculators. There are 2M reference registers, and a selector is further provided that selects and outputs M items of data from the reference registers. This selector selects only the M valid register outputs and supplies them to the arithmetic units, so that the error calculations for the search range that overlaps between coding blocks are performed by common arithmetic units in a time-division manner.

[0025] The invention of claim 7 is the motion vector detection device of claim 4, 5, or 6, wherein squared-difference calculators are used in place of the absolute-difference calculators.

[0026]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0027] <First Embodiment> FIG. 1 is a block diagram showing a first embodiment of the motion vector detection device according to the present invention. The conventional example shown in FIG. 26 has two sets of arithmetic units, each set connected in series via two 12-clock data delay units; each set performs the error calculation for one coding block, so that the error calculations for two coding blocks proceed simultaneously. In contrast, the present embodiment optimizes the input order of the reference pixels and inputs them at high speed, so that processing is done with just one set of arithmetic units connected in series via a 5-clock data delay unit. Compared with the conventional example, the present embodiment halves the number of arithmetic units and replaces four 12-clock data delay units with one 5-clock data delay unit.

[0028] As in the conventional example, the description below assumes a coding block of 4 horizontal pixels × 4 vertical pixels and a motion vector search range of -4 to +3 pixels horizontally and -4 to +3 pixels vertically. As in FIG. 25 of the conventional example, the matching operation is performed between the reference pixels and the pixels obtained by subsampling the coding block by 1/2 both horizontally and vertically, and the sum of absolute differences (AE) is used as the matching operation.

[0029] This motion vector detection device consists of a coding memory 1 that stores coding blocks, a reference memory 2 that stores reference blocks, a memory controller 3 that controls the input and output of coding memory 1 and reference memory 2, reference registers 4 and 5 connected in series, and arithmetic units (hereinafter PEs) 6 and 7 connected in series via a data delay unit 8. Each of PEs 6 and 7 receives coded pixel values from coding memory 1 and receives the outputs of reference registers 4 and 5 in common.

[0030] FIG. 2 is a block diagram of a PE in FIG. 1. Each of PEs 6 and 7 consists of coding registers 21 and 22, absolute-difference calculators 23 and 24, and an adder 25. Coded pixel values are input from coding memory 1 to coding registers 21 and 22; absolute-difference calculators 23 and 24 obtain the absolute differences between the coded pixel values and the reference pixel values input from the external reference registers; and adder 25 outputs the sum of the absolute differences and the input from the preceding stage.
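As a rough behavioral model of the PE of FIG. 2 (not a description of the actual circuit), the sketch below holds two coded pixel values and, on each step, adds the two absolute differences to the value arriving from the preceding stage. The class and method names are hypothetical.

```python
class PE:
    """Behavioral sketch of one arithmetic unit with two coding registers."""

    def __init__(self, coded_values):
        # Models coding registers 21 and 22, loaded from the coding memory.
        self.coded = list(coded_values)

    def step(self, ref_values, prev=0):
        """Models absolute-difference calculators 23, 24 and adder 25."""
        diffs = [abs(c - x) for c, x in zip(self.coded, ref_values)]
        return prev + sum(diffs)


pe = PE([10, 20])
first = pe.step([13, 15])            # |10 - 13| + |20 - 15|
chained = pe.step([13, 15], prev=4)  # same, plus the preceding stage's output
```

For the first PE in the chain, `prev` is simply left at 0.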

[0031] FIG. 3 is a timing chart showing the operation of reference registers 4 and 5 and PEs 6 and 7 in the embodiment shown in FIGS. 1 and 2. The operation is described with reference to FIG. 3, FIG. 4, and FIG. 25. The pixel symbols in FIG. 3 are those shown in FIG. 21. Whereas the conventional example inputs the coded pixels and reference pixels in the vertical direction, the present embodiment inputs them in the horizontal direction.

[0032] First, the input order of the coded pixel values and reference pixel values is described with reference to FIGS. 4 and 25. The coded pixel values are first stored in coding memory 1 of FIG. 1 and then input to PEs 6 and 7 block by block in the horizontal direction. For the subsampling pattern shown in FIG. 25, the coded pixel values are input in the order t0, t8, t2, t10; t0 and t8 are stored in the coding registers in PE 6, and t2 and t10 are stored in the coding registers in PE 7.

[0033] The reference pixel values, meanwhile, are first accumulated in reference memory 2 of FIG. 1 and then input to reference registers 4 and 5 in the pattern shown in FIG. 4. FIG. 4 shows a pattern based on the subsampling pattern of the coded pixels shown in FIG. 25. First, reference pixel values 1 to 25 are input in numerical order. Next, 25 pixels are input in the same pattern with the base point shifted by one pixel to 1'. The base point is then shifted in turn to 1'' and 1''' and the reference pixel values are input, which completes the input of all 10 pixels × 10 pixels in the reference pixel readout range shown in FIG. 4.
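The read-out order described in paragraph [0033] can be sketched as four passes over a 5 × 5 grid of every second pixel. The sketch uses the column-major numbering implied by the text (r0 to r10 is the first column of the 11 × 11 reference area). Only the first shift of the base point (one pixel to the right) is stated in the description; the positions assumed here for 1'' and 1''' are a guess, and `pass_order` is a hypothetical helper.

```python
REF_ROWS = 11  # column-major numbering: r0..r10 is the first column


def pass_order(base_col, base_row):
    """Indices of the 25 reference pixels of one pass, read row by row."""
    return [
        (base_col + 2 * c) * REF_ROWS + (base_row + 2 * k)
        for k in range(5)   # every second row
        for c in range(5)   # every second column within the row
    ]


# Base points 1, 1', 1'', 1'''; the positions of the last two are an assumption.
BASES = [(0, 0), (1, 0), (0, 1), (1, 1)]
order = [idx for base in BASES for idx in pass_order(*base)]
```

The first pass yields r0, r22, r44, r66, r88 followed by r2, r24, r46, r68, r90, and the four passes together visit each pixel of the 10 × 10 read-out range exactly once.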

[0034] This reference pixel input supplies the reference pixels corresponding to all prediction block candidates in the range -4 to +3 horizontally and -4 to +3 vertically. Because the matching operation is performed against subsampled coded pixels, the reference pixel readout range is slightly smaller than the search range.

[0035] In the timing chart of FIG. 3, the pixel values of the first coding block are first input from coding memory 1 in the order t0, t8, t2, t10 shown in FIG. 25; t0 and t8 are stored in the coding registers in PE 6, and t2 and t10 are stored in the coding registers in PE 7 [time T = 0 to 3]. The coding block is held in the coding registers until the next coding block is input. Meanwhile, r0, r22, r44, r66, r88 (pixel symbols as shown in FIG. 21(b)), the first row of reference pixel values, are input from reference memory 2 to reference registers 4 and 5 [T = 0 to 4], and the outputs of reference registers 4 and 5 are input to PEs 6 and 7 every clock.

[0036] From time T = 2, PE 6 computes the absolute differences between the outputs of reference registers 4 and 5 and the coded pixel values t0 and t8. The output data of PE 6 are therefore:

d0 = |t0 - r0| + |t8 - r22| [T = 2]
d22 = |t0 - r22| + |t8 - r44| [T = 3]
d44 = |t0 - r44| + |t8 - r66| [T = 4]
d66 = |t0 - r66| + |t8 - r88| [T = 5]

[0037] When the input of the first row of reference pixel values (5 pixels) is complete, the third row, r2, r24, r46, r68, r90, is input to reference registers 4 and 5, and the outputs of reference registers 4 and 5 are again input in sequence to PEs 6 and 7. PE 6 computes the absolute differences with t0 and t8. PE 7 computes the absolute differences with t2 and t10 and outputs the sum of that result and the absolute-difference result of the preceding stage (PE 6), delayed by 5 clocks in data delay unit 8 of FIG. 1.

【００３８】The output data from PE 6 is therefore:
d2 = |t0 - r2| + |t8 - r24| [T = 7]
d24 = |t0 - r24| + |t8 - r46| [T = 8]
d46 = |t0 - r46| + |t8 - r68| [T = 9]
d68 = |t0 - r68| + |t8 - r90| [T = 10]
and the output data from PE 7 is:
e0 = d0 + |t2 - r2| + |t10 - r24| [T = 7]
e22 = d22 + |t2 - r24| + |t10 - r46| [T = 8]
e44 = d44 + |t2 - r46| + |t10 - r68| [T = 9]
e66 = d66 + |t2 - r68| + |t10 - r90| [T = 10]

【００３９】For the coded block, e0 is the AE with the prediction block candidate whose motion vector is (-4, -4); e22, e44, and e66 are the AEs with the candidates at (-2, -4), (0, -4), and (2, -4), respectively. In this way the result of PE 6 is further accumulated in PE 7, and PE 7 outputs the AE between the coded block and each prediction block candidate.
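The accumulation through PE 6, the data delay unit, and PE 7 can be sketched as a small clock-level model. This is an illustration only: the function name `simulate_pe_chain`, the assumed two-clock latency between a pixel entering the reference registers and the PE output, and the sample pixel values are assumptions chosen to line up with the [T = ...] labels in the text, not details of the actual circuit.

```python
def simulate_pe_chain(block, stream, delay=5):
    """Clock-level sketch of PE 6 -> data delay unit -> PE 7.

    block  : (t0, t8, t2, t10), the four representative coded pixels
    stream : reference pixels in input order, one per clock starting at T = 0
    delay  : depth of the data delay unit in clocks (5 in the first embodiment)
    Returns two dicts mapping clock T to the PE 6 and PE 7 outputs.
    """
    t0, t8, t2, t10 = block
    d_out, e_out = {}, {}
    for T in range(2, len(stream) + 1):
        # Assumed latency: the output at clock T uses the pixels that
        # entered the reference registers at clocks T-2 and T-1.
        left, right = stream[T - 2], stream[T - 1]
        d_out[T] = abs(t0 - left) + abs(t8 - right)
        if T - delay in d_out:
            # PE 7 adds its own partial sum to the delayed PE 6 result.
            e_out[T] = d_out[T - delay] + abs(t2 - left) + abs(t10 - right)
    return d_out, e_out


# Hypothetical sample data: row 0 (r0, r22, r44, r66, r88) followed by
# row 2 (r2, r24, r46, r68, r90) of the reference picture.
row0 = [10, 12, 15, 11, 9]
row2 = [14, 13, 10, 8, 12]
block = (11, 13, 12, 9)  # t0, t8, t2, t10

d, e = simulate_pe_chain(block, row0 + row2)
```

Under these assumptions `e[7]` to `e[10]` play the role of e0, e22, e44, and e66. The value `d[6]` mixes the last pixel of one row with the first pixel of the next and would simply be ignored.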

【００４０】After reference pixel values 1 to 25 have been input in the pattern shown in FIG. 4, the base point is shifted one pixel to the right and reference pixel values are again input in sequence (corresponding to r11, r33, ... in FIG. 21(b)) [T = 25 onward]. Reference pixel input continues in the same way with the base point shifted further, and at time [T = 99] the input of 10 pixels × 10 pixels = 100 reference pixel values is complete. The prediction block candidate with the smallest AE is then selected as the prediction block.
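The readout order of the 100 reference pixels can be sketched as follows. Two things here are assumptions rather than statements from the text: the index rule n = 11 × column + row for pixel r_n (inferred from r0, r22, r44 lying in one row and r11 lying one pixel to the right), and the order of the third and fourth base points, since the text only states the first shift to the right. The name `read_order` is hypothetical.

```python
def read_order(columns=10, rows=10, stride=11):
    """Return reference pixel indices n (as in r_n) in the assumed input order."""
    order = []
    # Assumed base-point sequence; only the first shift (one pixel to the
    # right) is stated in the text.
    for base_x, base_y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        for y in range(base_y, rows, 2):          # every other row
            for x in range(base_x, columns, 2):   # every other pixel in the row
                order.append(stride * x + y)
    return order


order = read_order()
```

With one pixel read per clock, position k in `order` is the pixel read at T = k, so `order[25]` is the first pixel of the second pattern.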

【００４１】The above operation, that is, the 100-clock process of inputting the pixel values of one coded block and 100 reference pixels, is repeated, and a prediction block is selected for each coded block in turn. A motion vector detection device is thus obtained that matches the motion vector detection performance of the conventional example while using fewer absolute-difference calculators and a smaller data delay, and that therefore has a very small circuit scale.
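Functionally, the 100-clock cycle computes a subsampled full search. The sketch below shows that computation without any timing; the 4 × 4 block size, the 2:1 subsampling in each direction, the function name `subsampled_search`, and the synthetic reference window are assumptions made for illustration.

```python
def subsampled_search(block, ref, search=8, step=2):
    """Return (best_mv, best_ae) for one coded block.

    block : 4x4 list of rows of coded pixels, block[y][x]
    ref   : reference window, ref[y][x], at least (search + 2) pixels square;
            ref[0][0] is the top-left pixel of candidate (-4, -4)
    Only the representative pixels at even x and y take part in the match.
    """
    best = None
    for cy in range(search):
        for cx in range(search):
            ae = sum(abs(block[y][x] - ref[cy + y][cx + x])
                     for y in range(0, 4, step)
                     for x in range(0, 4, step))
            mv = (cx - search // 2, cy - search // 2)  # (horizontal, vertical)
            if best is None or ae < best[1]:
                best = (mv, ae)
    return best


# Synthetic reference window in which every pixel value is unique.
ref = [[100 * y + x for x in range(11)] for y in range(11)]
# Cut the coded block out of the window at candidate position (6, 1).
block = [[ref[1 + y][6 + x] for x in range(4)] for y in range(4)]

best = subsampled_search(block, ref)
```

Because the block is copied from the window at candidate position (6, 1), the search returns the motion vector (2, -3) with an AE of 0.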

【００４２】<Second Embodiment> FIG. 5 is a block diagram showing a second embodiment of the motion vector detection device according to the present invention. Like the conventional example shown in FIG. 26, this embodiment performs the error calculations for two adjacent coded blocks simultaneously. However, whereas the conventional example has four 12-clock delay units, this embodiment has only two 3-clock delay units.

【００４３】This motion vector detection device comprises the encoding memory 1, the reference memory 2, a memory controller 31 that controls the input and output of the encoding memory 1 and the reference memory 2, the reference registers 4 and 5, and PEs 6, 7 and PEs 32, 33, connected in series via data delay units 34 and 35, respectively. PEs 6, 7, 32, and 33 have the same configuration as the PE shown in FIG. 2. FIG. 6 is a timing chart showing the operation of the reference registers 4 and 5 and PEs 6, 7, 32, and 33 in the embodiment shown in FIG. 5; the operation is described with reference to FIGS. 6, 7, and 8. The pixel symbols in FIG. 6 are those shown in FIG. 8.

【００４４】First, the input order of the coded pixel values and the reference pixel values is described with reference to FIGS. 7 and 8. The coded pixel values are first stored in the encoding memory 1 of FIG. 5 and then input block by block in the horizontal direction. For coded block 0 in FIG. 8, for example, they are input in the order t0, t8, t2, t10. Coded block 0 is stored in the encoding registers of PEs 6 and 7 and the adjacent coded block 1 in the encoding registers of PEs 32 and 33, so the PEs always hold two coded blocks.

【００４５】The reference pixel values, on the other hand, are input to the reference registers 4 and 5 in units of six columns in the pattern shown in FIG. 7. FIG. 7 is a pattern based on the subsampling pattern of the coded pixels shown in FIG. 25. Reference pixel values 1 to 15 are first input in numerical order; the base point is then shifted by one pixel to 1' and 1' to 15' are input; input continues with the base point shifted in turn to 1'' and 1'''. While coded blocks a and b of FIG. 7 are held in the PEs, the six columns (60 pixels) 1 to 15''' indicated by the reference pixel readout range in FIG. 7 are input from the reference memory 2. This supplies the reference pixel values for the prediction block candidates in the following ranges:
Coded block a: horizontal 0 to +3, vertical -4 to +3
Coded block b: horizontal -4 to -1, vertical -4 to +3

【００４６】For any one coded block, the reference pixel values for all prediction block candidates are input in two passes of six columns each. The motion vector ranges of the candidates covered by each pass are:
First reference pixel input: horizontal -4 to -1, vertical -4 to +3
Second reference pixel input: horizontal 0 to +3, vertical -4 to +3
Here too, because the error calculation is performed against subsampled coded pixels, the reference pixel readout range is slightly smaller than the search range.
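How a six-column read window splits the horizontal search range between two adjacent blocks can be checked with a short sketch. The coordinate convention (a block whose left edge is at x has its zero-displacement candidate at x), the 4-pixel block width, and the name `covered_vectors` are assumptions made for illustration.

```python
def covered_vectors(window_start, block_x, columns=6, span=2):
    """Horizontal motion vectors of one block reachable from one read window.

    window_start : x coordinate of the first column read from reference memory
    block_x      : x coordinate of the block's left edge
    columns      : width of the read window (6 in the second embodiment)
    span         : x offset of the block's rightmost representative pixel
    """
    last_base = window_start + columns - 1 - span
    return [x - block_x for x in range(window_start, last_base + 1)]


# Blocks a and b are 4 pixels apart; the next window starts 4 columns further on.
first_pass_a = covered_vectors(window_start=0, block_x=0)   # block a
first_pass_b = covered_vectors(window_start=0, block_x=4)   # block b
second_pass_b = covered_vectors(window_start=4, block_x=4)  # block b again
```

In this model block b receives -4 to -1 from the first window and 0 to +3 from the second, which together make up the full -4 to +3 range.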

【００４７】After the six columns of reference pixel values have been input, the coded block to the right is input and six columns of reference pixel values are input starting from a position shifted four columns to the right. This is shown in FIG. 8. First, r0 to r64 are input in the order r0, r22, r44, ... as shown in FIG. 7 (because of the subsampling, the bottom row of reference pixel values need not be input); then r44 to r108, starting four columns to the right of r0, are input in the order r44, r66, r88, ....

【００４８】In FIG. 6, the first coded block is input from the encoding memory 1 in the order t0, t8, t2, t10; t0 and t8 are stored in the encoding register in PE 6 and t2 and t10 in the encoding register in PE 7 [time T = 0 to 3]. The next coded block is then input in the order t16, t24, t18, t26; t16 and t24 are stored in the encoding register in PE 32 and t18 and t26 in the encoding register in PE 33 [T = 4 to 7], so that the encoding registers in PEs 6, 7, 32, and 33 hold coded pixel values for two blocks in total. Thereafter the coded pixel values are updated one block at a time, and the PEs always hold coded pixel values for two blocks.

【００４９】After the coded block pixels t0, t8, t2, t10 have been input, the first row of reference pixel values, r0, r22, r44, is input from the reference memory 2 to the reference registers 4 and 5 [T = 4 to 6], and the outputs of the reference registers 4 and 5 are input to the PEs. From time T = 6, PE 6 computes the absolute differences between the outputs of the reference registers 4 and 5 and the coded pixel values t0 and t8. The output data of PE 6 is therefore:
d0 = |t0 - r0| + |t8 - r22| [T = 6]
d22 = |t0 - r22| + |t8 - r44| [T = 7]
PE 32 computes the absolute differences with the coded pixel values t16 and t24 and outputs:
g0 = |t16 - r0| + |t24 - r22| [T = 6]
g22 = |t16 - r22| + |t24 - r44| [T = 7]

【００５０】When the input of the first row of reference pixel values (3 pixels) is complete, the third row, r2, r24, r46, is input to the reference registers 4 and 5, and their outputs are input in turn to the PEs. PE 6 continues to compute the absolute differences with t0 and t8. PE 7 computes the absolute differences with t2 and t10 and outputs the sum of that result and the absolute-difference result of the preceding stage (PE 6), delayed by 3 clocks in the data delay unit 34.

【００５１】The output data from PE 6 is therefore:
d2 = |t0 - r2| + |t8 - r24| [T = 9]
d24 = |t0 - r24| + |t8 - r46| [T = 10]
and the output data from PE 7 is:
e0 = d0 + |t2 - r2| + |t10 - r24| [T = 9]
e22 = d22 + |t2 - r24| + |t10 - r46| [T = 10]
The reference pixel values are also input to PEs 32 and 33. The output data of PE 32 is:
g2 = |t16 - r2| + |t24 - r24| [T = 9]
g24 = |t16 - r24| + |t24 - r46| [T = 10]
and the output data from PE 33 is:
h0 = g0 + |t18 - r2| + |t26 - r24| [T = 9]
h22 = g22 + |t18 - r24| + |t26 - r46| [T = 10]

【００５２】For coded block 0, e0 is the AE with the prediction block candidate whose motion vector is (0, -4), and e22 is the AE with the candidate at (2, -4). For coded block 1, h0 is the AE with the candidate at (-4, -4), and h22 is the AE with the candidate at (-2, -4). In this way the result of PE 6 is further accumulated in PE 7 and the result of PE 32 in PE 33, and PEs 7 and 33 output the AEs between coded blocks 0 and 1, respectively, and their prediction block candidates.
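The two PE chains of the second embodiment consume the same reference stream, each with a 3-clock delay. The following sketch models that sharing. The helper name `chain_outputs`, the local clock (clock 0 here corresponds to T = 4 in the text), the assumed two-clock register latency, and all pixel values are illustrative assumptions.

```python
def chain_outputs(block, stream, delay):
    """AE values leaving the second PE of one chain, keyed by local clock.

    block  : (upper-left, upper-right, lower-left, lower-right) coded pixels
    stream : reference pixels, one per clock; local clock 0 is the first pixel
    delay  : depth of the data delay unit between the two PEs
    """
    ta, tb, tc, td = block
    first, out = {}, {}
    for clk in range(2, len(stream) + 1):
        left, right = stream[clk - 2], stream[clk - 1]
        first[clk] = abs(ta - left) + abs(tb - right)        # first PE
        if clk - delay in first:
            # second PE: own partial sum plus the delayed first-PE result
            out[clk] = first[clk - delay] + abs(tc - left) + abs(td - right)
    return out


row0 = [20, 24, 22]          # r0, r22, r44 (hypothetical values)
row2 = [21, 25, 23]          # r2, r24, r46
stream = row0 + row2
block0 = (20, 24, 21, 25)    # t0, t8, t2, t10
block1 = (22, 20, 23, 19)    # t16, t24, t18, t26

e = chain_outputs(block0, stream, delay=3)   # PE 6 -> delay unit 34 -> PE 7
h = chain_outputs(block1, stream, delay=3)   # PE 32 -> delay unit 35 -> PE 33
```

Local clocks 5 and 6 correspond to T = 9 and T = 10, so `e[5]`, `e[6]` stand for e0, e22 and `h[5]`, `h[6]` for h0, h22. Both chains see identical reference pixels and differ only in the coded pixels held in their registers.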

【００５３】After reference pixel values 1 to 15 have been input in the pattern shown in FIG. 7, the base point is shifted one pixel to the right to 1', and reference pixel values 1' to 15' are input in sequence (corresponding to r11, r33, ... in FIG. 8) [T = 19 onward]. Reference pixel input continues in the same way with the base point shifted further, and at time [T = 63] the input of six columns, 60 reference pixel values, is complete. For each block, two such six-column inputs complete the error calculation against the prediction block candidates of the entire search range, and the candidate with the smallest AE is selected as the prediction block. Repeating this operation every 60 clocks selects a prediction block for each coded block in turn.

【００５４】As described above, the device has two sets of operation units, each set connected in series via a data delay unit. The representative pixels of one coded block are input from the encoding memory to one set of operation units, so that the encoding registers of the two sets hold the representative pixels of two adjacent coded blocks, the newly input one and the one already input, and the error calculations for these two adjacent coded blocks are performed simultaneously. Reference pixels are input from the reference memory such that, in one reference pixel input, the left block of the two adjacent coded blocks is compared against the prediction block candidates in the right part of its search range and the right block against the candidates in the left part of its search range. Because two reference pixel inputs cover the prediction block candidates of the entire search range, a motion vector detection device is obtained that matches the motion vector detection performance of the conventional example while needing a smaller data delay and a very small circuit scale.

【００５５】Compared with the first embodiment, this embodiment has twice as many PEs and data delay units, but the delay of each data delay unit is only 3 clocks instead of 5. In addition, whereas the first embodiment needs 100 clocks for the error calculation of one coded block, this embodiment completes the error calculation of one coded block every 60 clocks, so the operating frequency can be lowered.

【００５６】<Third Embodiment> FIG. 9 is a block diagram showing a third embodiment of the motion vector detection device according to the present invention. Whereas the second embodiment has two sets of PEs connected in series via data delay units and processes the matching calculations of two adjacent coded blocks in parallel, this embodiment processes every other coded block in parallel. Furthermore, whereas the motion vector search range of the second embodiment is -4 to +3 in the horizontal direction, this embodiment changes the data delay units from a 3-clock delay to a 5-clock delay and the reference pixel readout range per coded block from 6 columns to 10 columns, and can thereby search -8 to +7, a range twice as wide as in the second embodiment.

【００５７】This motion vector detection device differs from that of the second embodiment shown in FIG. 5 only in the control performed by the memory controller 41 and in the delay of the data delay units 42 and 43; it otherwise operates in the same way. FIG. 10 is a timing chart showing the operation of the reference registers 4 and 5 and PEs 6, 7, 32, and 33 in this embodiment, for an example in which the motion vector search range is -4 to +3 in the vertical direction and -8 to +7 in the horizontal direction. The reference pixels are read out in the same way as in the first embodiment shown in FIG. 4, but the coded pixels are read out differently.

【００５８】FIG. 11 shows the relationship between the reference pixels and the coded pixels. It differs from FIG. 7 in the positions of the coded blocks. First, the coded pixel values at the positions of coded blocks c and d are input to PEs 6, 7, 32, and 33 of FIG. 9, and at the same time the reference pixel values are read out in the numerical order of FIG. 11: 1 to 25, 1' to 25', 1'' to 25'', 1''' to 25'''. Inputting the 10 columns of reference pixel values indicated by the reference pixel readout range in FIG. 11 supplies the reference pixel values for the prediction block candidates in the following ranges:
Coded block c: horizontal 0 to +7, vertical -4 to +3
Coded block d: horizontal -8 to -1, vertical -4 to +3
Here too, because the error calculation is performed against subsampled coded pixel values, the reference pixel readout range is slightly smaller than the search range.

【００５９】FIG. 12 shows the relationship between reference pixel readout and the coded blocks. First, the coded pixel values of coded blocks 0 and 2 are input to the PEs and, at the same time, the reference pixel values r0 to r108 are input in the order shown in FIG. 11 (r0, r22, r44, ... in FIG. 12), and the AEs are obtained. Next, the coded pixel values of coded blocks 1 and 3 are input to the PEs and, at the same time, the reference pixel values r44 to r152, with the base point shifted four columns to the right of r0, are input in the order r44, r66, r88, ..., and the AEs are obtained. Then the coded pixel values of coded blocks 2 and 4 are input to the PEs and, at the same time, the reference pixel values r88 to r196, with the base point shifted four columns to the right of r44, are input in the order r88, r110, r132, ..., and the AEs are obtained.

【００６０】As a result, reference pixels are input for the prediction block candidates in the following motion vector search ranges:
First reference pixel input:
Coded block 0: horizontal 0 to +7, vertical -4 to +3
Coded block 2: horizontal -8 to -1, vertical -4 to +3
Second reference pixel input:
Coded block 1: horizontal 0 to +7, vertical -4 to +3
Coded block 3: horizontal -8 to -1, vertical -4 to +3
Third reference pixel input:
Coded block 2: horizontal 0 to +7, vertical -4 to +3
Coded block 4: horizontal -8 to -1, vertical -4 to +3
For coded block 2, the AEs with the prediction block candidates of the entire search range are therefore calculated from two reference pixel inputs:
First reference pixel input: horizontal -8 to -1, vertical -4 to +3
Third reference pixel input: horizontal 0 to +7, vertical -4 to +3
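The pairing of every other coded block with successive reference pixel inputs can be written out as a small schedule. This is a sketch of the bookkeeping only; the function names `pass_coverage` and `block_coverage` and the 1-based pass numbering are assumptions chosen to match the wording "first", "second", and "third reference pixel input".

```python
def pass_coverage(pass_index, half_range=8):
    """Blocks handled in one reference pixel input (1-based pass_index)
    and the horizontal motion vectors evaluated for each of them."""
    left_block = pass_index - 1
    right_block = pass_index + 1
    return {
        left_block: list(range(0, half_range)),      # 0 .. +7
        right_block: list(range(-half_range, 0)),    # -8 .. -1
    }


def block_coverage(block, passes):
    """All horizontal motion vectors evaluated for `block` over `passes`."""
    vectors = []
    for p in passes:
        vectors += pass_coverage(p).get(block, [])
    return sorted(vectors)


coverage_block2 = block_coverage(2, passes=[1, 2, 3])
```

Coded block 2 appears as the right-hand block in the first input and as the left-hand block in the third, so `coverage_block2` spans -8 to +7.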

【００６１】In the time chart of FIG. 10, the coded pixel values of coded blocks 0 and 2 are input first; those of coded block 0 are stored in the encoding registers in PEs 6 and 7 and those of coded block 2 in the encoding registers in PEs 32 and 33 [T = 0 to 7]. In parallel with the input of the coded pixel values, input of the reference pixel values starts at [T = 2]. The first row (5 pixels) of reference pixel values, r0, r22, r44, r66, r88, is input at [T = 2 to 6].

【００６２】The output of PE 6 is:
d0 = |t0 - r0| + |t8 - r22| [T = 4]
d22 = |t0 - r22| + |t8 - r44| [T = 5]
d44 = |t0 - r44| + |t8 - r66| [T = 6]
d66 = |t0 - r66| + |t8 - r88| [T = 7]
and the output of PE 32 is:
g0 = |t32 - r0| + |t40 - r22| [T = 4]
g22 = |t32 - r22| + |t40 - r44| [T = 5]
g44 = |t32 - r44| + |t40 - r66| [T = 6]
g66 = |t32 - r66| + |t40 - r88| [T = 7]

【００６３】Then, at time [T = 7 to 11], the third row of reference pixel values, r2, r24, r46, r68, r90, is input. PE 7 computes the absolute differences between the reference pixel values and the coded pixel values and at the same time adds the output of the preceding stage (PE 6), delayed by 5 clocks in the data delay unit 42. The output of PE 7 is therefore:
e0 = d0 + |t2 - r2| + |t10 - r24| [T = 9]
e22 = d22 + |t2 - r24| + |t10 - r46| [T = 10]
e44 = d44 + |t2 - r46| + |t10 - r68| [T = 11]
e66 = d66 + |t2 - r68| + |t10 - r90| [T = 12]

【００６４】PE 33 likewise computes the absolute differences between the reference pixel values and the coded pixel values and at the same time adds the output of the preceding stage (PE 32), delayed by 5 clocks in the data delay unit 43. The output of PE 33 is therefore:
h0 = g0 + |t34 - r2| + |t42 - r24| [T = 9]
h22 = g22 + |t34 - r24| + |t42 - r46| [T = 10]
h44 = g44 + |t34 - r46| + |t42 - r68| [T = 11]
h66 = g66 + |t34 - r68| + |t42 - r90| [T = 12]

【００６５】For coded block 0, e0, e22, e44, and e66 are the AEs with the prediction block candidates whose motion vectors are (0, -4), (2, -4), (4, -4), and (6, -4), respectively. For coded block 2, h0, h22, h44, and h66 are the AEs with the prediction block candidates whose motion vectors are (-8, -4), (-6, -4), (-4, -4), and (-2, -4), respectively. In this way the result of PE 6 is further accumulated in PE 7 and the result of PE 32 in PE 33; PE 7 outputs the AEs between coded block 0 and its prediction block candidates, and PE 33 outputs the AEs between coded block 2 and its prediction block candidates.

【００６６】The first 10 columns of reference pixels, 100 pixels, are input during [T = 2 to 101], followed by the next 10 columns, 100 pixels, during [T = 102 to 201], and this operation is repeated thereafter. The coded pixels of coded blocks 1 and 3 are input at [T = 100 to 107], and input is repeated thereafter with the coded blocks shifted every 100 clocks.

【００６７】As described above, the device has two sets of operation units, each set connected in series via a data delay unit; the representative pixels of every other coded block are input from the encoding memory to the operation units, and the operation units perform the error calculations for every other coded block in parallel. This yields a motion vector detection device that can search a wider motion vector range than the second embodiment with an equivalent circuit scale. Specifically, reference pixels are input from the reference memory such that, in one reference pixel input, the left block of the pair of every-other coded blocks is compared against the prediction block candidates in the right part of its search range and the right block against the candidates in the left part of its search range. Two reference pixel inputs then cover the prediction block candidates of the entire search range. With almost the same circuit scale, the motion vector search range can thus be doubled by increasing the reference pixel readout amount only about 1.7 times, so the operating frequency can be lowered further when a wide range is searched, which reduces power consumption. Compared with the first embodiment, this embodiment has twice the search range and twice the circuit scale, but the same reference pixel readout amount, so the readout amount per unit of search range is halved.
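The "about 1.7 times" and "halved" figures follow from the readout counts given for each embodiment. The short calculation below reproduces them; the table layout and the names `embodiments` and `reads_per_candidate` are illustrative, while the numbers are taken from the text (100, 60, and 100 pixels per coded block; 8 × 8, 8 × 8, and 16 × 8 candidates).

```python
embodiments = {
    # name: (reference pixels read per coded block,
    #        horizontal candidates, vertical candidates)
    "first":  (100, 8, 8),    # 10 x 10 pixels, search -4..+3 in both directions
    "second": (60, 8, 8),     # 6 columns x 10 rows per 60-clock cycle
    "third":  (100, 16, 8),   # 10 columns x 10 rows, search -8..+7 horizontally
}


def reads_per_candidate(name):
    """Reference pixels read per motion vector candidate evaluated."""
    reads, horizontal, vertical = embodiments[name]
    return reads / (horizontal * vertical)


# Third vs. second embodiment: readout grows by 100 / 60, search range doubles.
read_ratio = embodiments["third"][0] / embodiments["second"][0]

# Third vs. first embodiment: same readout, twice the candidates.
per_candidate_ratio = reads_per_candidate("third") / reads_per_candidate("first")
```

Here `read_ratio` evaluates to 100/60, roughly 1.67, and `per_candidate_ratio` to 0.5.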

【００６８】<Fourth Embodiment> The fourth embodiment changes the search range of the first to third embodiments. The motion vector search range depends on the reference pixel readout range. In the first embodiment, for example, the reference pixel readout range is 10 pixels horizontally and 10 pixels vertically, as shown in FIG. 4, and changing this range changes the motion vector search range. As an example, consider extending the search range of the first embodiment by 2 pixels both horizontally and vertically, to -5 to +4 in the horizontal direction and -5 to +4 in the vertical direction. This can be done by likewise increasing the reference pixel readout range by 2 pixels in each direction, to 12 pixels horizontally and 12 pixels vertically.

【００６９】In the first embodiment of FIG. 1, the delay of the data delay unit 8 is 5 clocks, which corresponds to reading 5 pixels at a time, every other pixel in the horizontal direction, in the reference pixel readout order shown in FIG. 4. When the horizontal reference pixel readout range is 12 pixels, 6 pixels are read at a time, every other pixel in the horizontal direction, so the delay of the data delay unit should be set to 6 clocks. The search range of the second and third embodiments can be changed in the same way. If more reference pixels are read in the vertical direction, more clock cycles are needed for readout, but the circuit configuration does not need to change. In this way, a variant with a different motion vector search range is obtained simply by changing the delay of the data delay unit and the readout operation of the memory controller.
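The relation between search range, readout width, and delay can be condensed into two one-line formulas. They are inferred from the numbers in the text (10 pixels and 5 clocks, 12 pixels and 6 clocks, 6 columns and 3 clocks) rather than stated there, and they assume a 4-pixel-wide block with 2:1 subsampling; the names `readout_width` and `delay_clocks` are hypothetical.

```python
def readout_width(n_candidates, block_size=4, step=2):
    """Pixels to read along one axis to cover n_candidates positions.

    The last representative pixel of a block sits at offset
    block_size - step, so that many extra pixels are needed beyond
    the candidate positions themselves.
    """
    return n_candidates + block_size - step


def delay_clocks(width, step=2):
    """Subsampled pixels per row, taken as the data delay unit depth."""
    return width // step


first_embodiment = delay_clocks(readout_width(8))    # search -4..+3
extended_search = delay_clocks(readout_width(10))    # search -5..+4
second_embodiment = delay_clocks(readout_width(4))   # 4 candidates per pass
third_embodiment = delay_clocks(readout_width(8))    # 8 candidates per pass
```

Under these assumptions the four cases give delays of 5, 6, 3, and 5 clocks, matching the values quoted for the respective embodiments.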

【００７０】<Fifth Embodiment> FIG. 13 is a block diagram showing a fifth embodiment of the motion vector detection device according to the present invention. It is an application of the second embodiment shown in FIG. 5. If the second embodiment is applied to larger coded blocks, many operation units are connected to the reference registers 4 and 5, and the load on them may become heavy. The embodiment of FIG. 13 has two sets of reference registers, which halves the number of operation units connected to each reference register and lightens the load.

[0071] Compared with the motion vector detection device of the second embodiment shown in FIG. 5, the device of FIG. 13 adds reference registers 52 and 53, connects PEs 32 and 33 and data delay unit 35 to reference registers 52 and 53, and uses a different control scheme in memory controller 51. In all other respects it operates in the same way.

[0072] FIG. 14 is a time chart showing the operation of the fifth embodiment, for a motion vector search range of -4 to +3 vertically and -4 to +3 horizontally. In this embodiment, the outputs of reference registers 52 and 53, which supply reference pixel values to PEs 32 and 33, lag the output timing of reference registers 4 and 5 by two clocks. The computation timing of PEs 32 and 33 therefore also lags that of PEs 6 and 7 by two clocks; otherwise the operation is the same as in the second embodiment shown in FIG. 5. The timing chart likewise differs from the time chart of the second embodiment shown in FIG. 6 only in that the operation of PEs 32 and 33 is delayed by two clocks. FIG. 13 is a modification of the second embodiment shown in FIG. 5, but the third embodiment shown in FIG. 9 can be modified in the same way.

[0073] <Sixth Embodiment> FIG. 15 is a block diagram of a sixth embodiment of the present invention. This embodiment provides the same function as the second embodiment shown in FIG. 5. Whereas the second embodiment has four PEs, this embodiment has two sets of reference registers and two sets of encoding registers inside each PE, and selects the output of one register set in each case, which reduces the number of PEs to two.

[0074] The motion vector detection device of FIG. 15 comprises encoding memory 1, reference memory 2, two sets of reference registers (4, 5 and 62, 63) connected in series via data delay unit 64, selectors 65 and 66 that select the output of one of the two sets of reference registers, and PEs 67 and 68 connected in series via data delay unit 69.

[0075] FIG. 16 shows the configuration of PEs 67 and 68 of this embodiment. As described above, the PE of FIG. 16 has two sets of encoding registers (21, 22 and 71, 72) and selects the output of one of the two sets.

[0076] The operation of this embodiment is described with reference to the time chart of FIG. 17. FIG. 17 shows the operation for a motion vector search range of -4 to +3 vertically and -4 to +3 horizontally; the pixel value symbols correspond to those of FIG. 8. Encoded pixel values are read in the same way as in the second embodiment: reading of the next encoding block starts as soon as reading of the six columns of reference pixel values has finished. PEs 67 and 68 each have encoding registers for 4 pixels, for a total of 8 pixels, and so hold the encoded pixel values of two encoding blocks.

[0077] FIG. 18 shows the readout order of the reference pixel values in this embodiment. In the second embodiment the reference pixel values are read in the horizontal direction, as shown in FIG. 7; in this embodiment they are read in the vertical direction. In FIG. 18, PEs 67 and 68 hold the encoded pixels of encoding blocks a and b. Reference pixel values 1 to 15 are read in numerical order, then 1' to 15' are read with 1' as the base point, and the base point is then shifted to 1'' and 1''', so that the six columns (60 pixels) from 1 to 15''' indicated by the reference pixel readout range in FIG. 18 are read.

[0078] In the time chart of FIG. 17, the pixel values of encoding blocks 0 and 1 are first read from encoding memory 1 and input to PEs 67 and 68 [time T = 0 to 7]. The pixel values of encoding block 0 are stored in encoding registers 21 and 22 of the PE shown in FIG. 16, and the pixel values of encoding block 1 are stored in encoding registers 71 and 72. The reference pixels are input six columns at a time according to the pattern of FIG. 18. First, r0, r2, r4, r6 and r8 of the first column are input [T = 0 to 4] and stored sequentially in reference registers 5 and 4. After a two-clock delay in data delay unit 64 of FIG. 15, r0 to r8 are also stored in reference registers 62 and 63 [T = 4 to 8]. Next, after three clocks of dummy data, r22, r24, r26, r28 and r30 of the third column are input to reference registers 5 and 4 [T = 8 to 12]. Starting at time [T = 2], selectors 65 and 66 of FIG. 15 switch every four clocks between selecting the output data of reference registers 4 and 5 and the output data of reference registers 62 and 63. Likewise, starting at time [T = 2], selectors 73 and 74 inside the PE switch every four clocks between selecting the output data of encoding registers 21 and 22 and the output data of encoding registers 71 and 72.
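The four-clock alternation of the selectors can be modelled as a simple function of the clock count. The sketch below is illustrative only; `selected_bank` and its bank numbering (0 for reference registers 4, 5 and encoding registers 21, 22; 1 for reference registers 62, 63 and encoding registers 71, 72) are hypothetical names introduced here, not part of the described circuit.

```python
from typing import Optional


def selected_bank(t: int, start: int = 2, period: int = 4) -> Optional[int]:
    """Return which register set the selectors pick at clock t.

    0 -> reference registers 4, 5 and encoding registers 21, 22
    1 -> reference registers 62, 63 and encoding registers 71, 72
    None before switching starts at T = start.
    """
    if t < start:
        return None
    return ((t - start) // period) % 2


for t in range(0, 18):
    print(t, selected_bank(t))
```

With the default arguments this gives bank 0 for T = 2 to 5 and T = 10 to 13, and bank 1 for T = 6 to 9 and T = 14 to 17, which is the pattern the time chart description implies.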

[0079] That is, during the period [T = 2 to 5], the outputs of reference registers 4 and 5 and the values t0 and t2 held in encoding registers 21 and 22 are selected, and the results computed by PE 67 are:

d0 = |t0 - r0| + |t2 - r2|   [T = 2]
d2 = |t0 - r2| + |t2 - r4|   [T = 3]
d4 = |t0 - r4| + |t2 - r6|   [T = 4]
d6 = |t0 - r6| + |t2 - r8|   [T = 5]

During the period [T = 6 to 9], reference registers 62 and 63 and the values t16 and t18 held in encoding registers 71 and 72 are selected, and the results computed by PE 67 are:

g0 = |t16 - r0| + |t18 - r2|   [T = 6]
g2 = |t16 - r2| + |t18 - r4|   [T = 7]
g4 = |t16 - r4| + |t18 - r6|   [T = 8]
g6 = |t16 - r6| + |t18 - r8|   [T = 9]

[0080] Similarly, in PE 68 the output switches between t8 and t10, stored in encoding registers 21 and 22, and t24 and t26, stored in encoding registers 71 and 72. The absolute difference with the output of reference registers 4, 5 or reference registers 62, 63 is computed and added to the result of the preceding stage (PE 67), delayed by 8 clocks in data delay unit 69. The following values are output:

e0 = d0 + |t8 - r22| + |t10 - r24|   [T = 10]
e2 = d2 + |t8 - r24| + |t10 - r26|   [T = 11]
e4 = d4 + |t8 - r26| + |t10 - r28|   [T = 12]
e6 = d6 + |t8 - r28| + |t10 - r30|   [T = 13]
h0 = g0 + |t24 - r22| + |t26 - r24|   [T = 14]
h2 = g2 + |t24 - r24| + |t26 - r26|   [T = 15]
h4 = g4 + |t24 - r26| + |t26 - r28|   [T = 16]
h6 = g6 + |t24 - r28| + |t26 - r30|   [T = 17]
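The two-stage accumulation can be illustrated with a short Python sketch. It is a behavioural model only, not a description of the circuit: the function name `pe_outputs` is hypothetical, the pixel values are made-up sample data, and the 8-clock delay of data delay unit 69 is modelled simply by passing the first stage's list of results to the second stage.

```python
def pe_outputs(t_a, t_b, refs, carry=None):
    """Partial errors of one PE for consecutive candidates.

    For candidate k the PE computes |t_a - refs[k]| + |t_b - refs[k + 1]|
    and adds the delayed output of the preceding PE, if any.
    """
    out = []
    for k in range(len(refs) - 1):
        partial = abs(t_a - refs[k]) + abs(t_b - refs[k + 1])
        out.append(partial + (carry[k] if carry is not None else 0))
    return out


# Made-up sample values (assumptions for illustration only).
t0, t2 = 10, 12            # encoded pixels held in PE 67
t8, t10 = 20, 18           # encoded pixels held in PE 68
col1 = [9, 12, 15, 11, 10]     # stands in for r0, r2, r4, r6, r8
col3 = [21, 17, 18, 25, 20]    # stands in for r22, r24, r26, r28, r30

d = pe_outputs(t0, t2, col1)            # d0, d2, d4, d6
e = pe_outputs(t8, t10, col3, carry=d)  # e0, e2, e4, e6
print(d)  # [1, 5, 6, 3]
print(e)  # [3, 8, 15, 10]
```

The same function applied to t16, t18 and t24, t26 would give the g and h values for the second encoding block.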

[0081] As a result, e0 is the AE between encoding block 0 and the prediction block candidate for motion vector (0, -4); e2 is the AE for (0, -2), e4 for (0, 0), and e6 for (0, +2). Likewise, h0 is the AE between encoding block 1 and the prediction block candidate for motion vector (-4, -4); h2 is the AE for (-4, -2), h4 for (-4, 0), and h6 for (-4, +2).
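As a small illustration of how these outputs relate to motion vectors, the sketch below pairs each AE with its candidate vector and picks the minimum, which is how claim 1 says the motion vector is chosen. The function name `best_vector`, the (horizontal, vertical) ordering of the tuple, and the sample AE values are assumptions made for this example.

```python
def best_vector(errors, horizontal, vertical_start=-4, step=2):
    """Return the candidate vector with the smallest AE.

    errors[i] is assumed to belong to the candidate
    (horizontal, vertical_start + step * i), mirroring the e0..e6 and
    h0..h6 assignments in the text. Ties go to the first candidate.
    """
    i = min(range(len(errors)), key=errors.__getitem__)
    return (horizontal, vertical_start + step * i)


# Made-up AE values for illustration.
print(best_vector([3, 8, 15, 10], horizontal=0))    # (0, -4)
print(best_vector([9, 4, 7, 4], horizontal=-4))     # (-4, -2)
```

In the actual device the comparison would run over all candidates in the search range, not just the four produced from one pair of columns.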

[0082] In this way, the input of five reference pixel values followed by three clocks of dummy data is repeated, and the input of the six columns of reference pixel values r0 to r64 shown in FIG. 8 is completed in the 96 clocks from time T0 to T95. During this time, error calculations are performed against the prediction block candidates in the following motion vector search ranges:

Encoding block 0: vertical -4 to +3, horizontal 0 to +3
Encoding block 1: vertical -4 to +3, horizontal -4 to -1

Next, the pixel values of encoding block 2 are input to encoding registers 21 and 22 of PEs 67 and 68 [T = 96 to 100], the six columns of reference pixel values r44 to r108 are input, and the error calculations proceed in sequence.

[0083] By repeating the above operations, the AEs for all prediction block candidates of one encoding block are output every 96 clocks. In the second embodiment of FIG. 5, processing one encoding block takes 60 clocks, so each arithmetic unit performs twice the computation in about 1.5 times the number of clocks. In other words, by increasing the number of clocks by a factor of about 1.5, the same function can be realized with half the number of PEs.
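The trade-off can be checked with simple arithmetic on the figures given in the text (60 clocks with four PEs in the second embodiment, 96 clocks with two PEs in the sixth). The variable names below are introduced only for this illustration.

```python
# Figures taken from the text; variable names are illustrative.
clocks_second, pes_second = 60, 4   # second embodiment (FIG. 5)
clocks_sixth, pes_sixth = 96, 2     # sixth embodiment (FIG. 15)

clock_ratio = clocks_sixth / clocks_second   # 1.6
pe_ratio = pes_sixth / pes_second            # 0.5
work_per_pe = 1 / pe_ratio                   # each PE does twice the work

print(f"clock ratio {clock_ratio:.2f}, PE ratio {pe_ratio:.2f}, "
      f"work per PE x{work_per_pe:.0f}")
```

The quotient of the two stated clock counts is 1.6, in line with the rough factor of 1.5 quoted above for halving the number of PEs.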

[0084] The embodiments described above use the sum of absolute differences as the error measure, but other measures, such as the sum of squared differences, may also be used. In that case it suffices, for example, to replace the absolute-difference calculators of FIG. 2 and FIG. 16 with calculators that compute the square of the difference; nothing else needs to change. Furthermore, the number of pixels in the encoding block, the subsampling pattern, and the motion vector search range are not limited to those shown in the embodiments, and the present invention is applicable in a variety of other cases.
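The point that only the per-pixel calculator changes can be sketched in Python by making the error measure a pluggable function. The names `abs_diff`, `sq_diff` and `block_error` and the sample pixel values are hypothetical and chosen for illustration; the accumulation around the per-pixel measure is left unchanged, as the paragraph describes.

```python
from typing import Callable, Sequence


def abs_diff(a: int, b: int) -> int:
    """Per-pixel absolute difference (used for the sum of absolute differences)."""
    return abs(a - b)


def sq_diff(a: int, b: int) -> int:
    """Per-pixel squared difference (used for the sum of squared differences)."""
    return (a - b) ** 2


def block_error(block: Sequence[int], candidate: Sequence[int],
                metric: Callable[[int, int], int] = abs_diff) -> int:
    """Accumulate the per-pixel error over the representative pixels."""
    return sum(metric(a, b) for a, b in zip(block, candidate))


# Made-up representative pixel values.
block = [10, 12, 20, 18]
candidate = [7, 12, 21, 17]
print(block_error(block, candidate))            # 5  (3 + 0 + 1 + 1)
print(block_error(block, candidate, sq_diff))   # 11 (9 + 0 + 1 + 1)
```

Swapping `metric` corresponds to swapping the calculator inside each PE while the adders, delay units and control stay the same.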

[0085]

[Effects of the Invention] According to the invention of claims 1 to 7, the cumulative addition of the matching operation is carried out in an order based on the subsampling pattern, so the number of arithmetic units and the circuit scale of the data delay units can be made much smaller than in the conventional example. In addition, when the vertical search area changes, only the readout amount of the reference pixels changes, and the configuration of the arithmetic units and data delay units does not have to change. When the horizontal search area changes, only the readout order of the reference pixels and the delay of the data delay units need to change, and the configuration of the arithmetic units does not have to change. The search area can therefore be changed easily.

[0086] According to the invention of claims 2 and 3, the error calculations for two encoding blocks are performed in parallel, so the operating frequency can be lowered further.

[0087] According to the invention of claim 4, for roughly the same circuit scale as the invention of claim 3, the search range of the motion vector can be enlarged by more than the increase in the readout amount of the reference pixels. When a wide range is searched, the operating frequency can be lowered still further, which reduces power consumption.

[0088] According to the invention of claims 5 and 6, the motion vector detection processing for adjacent encoding blocks is performed by time-sharing a single arithmetic unit. If the clock is increased, the number of arithmetic units can be reduced, and the circuit can be made smaller.

[0089] According to the invention of claim 7, the same effects can be obtained when the squared error is used for the matching operation.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a first embodiment of the motion vector detection device according to the present invention.

FIG. 2 is a block diagram of the arithmetic unit of the first embodiment.

FIG. 3 is a timing chart showing the operation of the reference registers and arithmetic units of the first embodiment.

FIG. 4 is a diagram explaining the readout order of the reference image in the first embodiment.

FIG. 5 is a block diagram showing a second embodiment of the motion vector detection device according to the present invention.

FIG. 6 is a timing chart showing the operation of the reference registers and arithmetic units of the second embodiment.

FIG. 7 is a diagram explaining the readout order of the reference image in the second embodiment.

FIG. 8 is a diagram explaining the readout order of the reference image in the second embodiment and its relationship to the encoding blocks for which the error calculation is performed.

FIG. 9 is a block diagram showing a third embodiment of the motion vector detection device according to the present invention.

FIG. 10 is a timing chart showing the operation of the reference registers and arithmetic units of the third embodiment.

FIG. 11 is a diagram explaining the readout order of the reference image in the third embodiment.

FIG. 12 is a diagram explaining the readout order of the reference image in the third embodiment and its relationship to the encoding blocks for which the error calculation is performed.

FIG. 13 is a block diagram showing a fifth embodiment of the motion vector detection device according to the present invention.

FIG. 14 is a timing chart showing the operation of the reference registers and arithmetic units of the fifth embodiment.

FIG. 15 is a block diagram showing a sixth embodiment of the motion vector detection device according to the present invention.

FIG. 16 is a block diagram of the arithmetic unit of the sixth embodiment.

FIG. 17 is a timing chart showing the operation of the reference registers and arithmetic units of the sixth embodiment.

FIG. 18 is a diagram explaining the readout order of the reference image in the sixth embodiment.

FIG. 19 is a block diagram showing a conventional motion vector detection device.

FIG. 20 is a block diagram showing the arithmetic unit of a conventional motion vector detection device.

FIG. 21 is an example of a motion vector search range in the conventional example.

FIG. 22 is a diagram showing the relationship between encoded pixels and reference pixels in the conventional example.

FIG. 23 is a time chart explaining the cumulative addition of error amounts in the conventional example.

FIG. 24 is a time chart explaining the cumulative addition of error amounts in the conventional example.

FIG. 25 is an example of a subsampling pattern of an encoding block.

FIG. 26 is a block diagram showing a conventional motion vector detection device for the case where the encoding block is subsampled.

FIG. 27 is a block diagram showing the arithmetic unit of a conventional motion vector detection device for the case where the encoding block is subsampled.

[Explanation of symbols]

1: Encoding memory
2: Reference memory
3: Memory controller
4, 5: Reference registers
6, 7: Arithmetic units
8: Data delay unit

Continuation of front page
(72) Inventor: Yoichi Fujiwara, c/o Sharp Corporation, 22-22 Nagaike-cho, Abeno-ku, Osaka-shi, Osaka
F-terms (reference): 5C059 KK15 KK19 KK49 LB05 MA27 NN01 NN03 NN28 NN31 TA62 TA63 TB08 TC03 TC12 TD05 TD06 UA34 UA38

Claims

[Claims]

1. A motion vector detection device that performs block matching against all prediction block candidates in a search area, using representative pixel values obtained by subsampling the pixels constituting an encoding block in a predetermined pattern, and detects a motion vector by finding the prediction block that best matches the encoding block, the device comprising: an encoding memory that stores the encoding block; a reference memory that stores reference pixel values of the prediction block candidates; a memory controller that controls input and output of the encoding memory and the reference memory; a reference register that stores reference pixel values from the reference memory; a plurality of data delay units that delay data output timing; and a plurality of arithmetic units connected in series via the data delay units, wherein the memory controller distributes the representative pixel values of the subsampled encoding block from the encoding memory to the arithmetic units, reads from the reference memory, based on the subsampling pattern, the reference pixel values of the prediction block candidates corresponding to the representative pixel values of the encoding block, and sequentially inputs the reference pixel values of all prediction block candidates in the search area to the reference register; each arithmetic unit holds pixel values of the subsampled encoding block, computes an error amount between those pixel values and the corresponding pixel values of the reference register, and cumulatively adds the error amount to the output value of another arithmetic unit; the data delay units adjust the output timing so that error amounts for the same prediction block candidate are cumulatively added; and the motion vector is determined by taking, as the prediction block, the prediction block candidate for which the error amount output from the final-stage arithmetic unit is smallest.

2. The motion vector detection device according to claim 1, comprising a plurality of sets of the plurality of data delay units and the plurality of arithmetic units, wherein one set of arithmetic units holds the representative pixel values of an encoding block that has already been input, the representative pixel values of an encoding block adjacent to that encoding block are output from the encoding memory to another set of arithmetic units, and both sets of arithmetic units perform the calculation on the reference pixels of common prediction block candidates and the respective representative pixel values.

3. The motion vector detection device according to claim 1, comprising two sets of the plurality of data delay units and the plurality of arithmetic units, wherein the memory controller outputs the representative pixel values of one encoding block from the encoding memory to one set of arithmetic units and outputs, to the other set of arithmetic units, the representative pixel values of the encoding block located one block away from that encoding block, and both sets of arithmetic units perform the calculation on the reference pixels of common prediction block candidates and the respective representative pixel values.

4. The motion vector detection device according to claim 1, wherein the arithmetic unit comprises: M encoding registers that store encoded pixel values for the number M of pixels on one side of the subsampled encoding block; M absolute-difference calculators that compute the absolute difference from the corresponding pixel value of the reference register; and an adder that adds the output of the preceding-stage arithmetic unit and the outputs of the M absolute-difference calculators.

5. The motion vector detection device according to claim 4, wherein the number of reference registers is 2M, a selector is further provided that selects and outputs M data items from the reference registers, only the M valid register outputs are selected by the selector and supplied to the arithmetic unit, and the error calculations in the search range that overlaps between encoding blocks are performed by a common arithmetic unit in time division.

6. The motion vector detection device according to claim 1, wherein the arithmetic unit comprises: M first encoding registers that store encoded pixel values for M pixels on one side of the subsampled encoding block; M second encoding registers that store M encoded pixels of an encoding block adjacent to the encoding block containing those encoded pixel values; a selector that selects M register outputs from the total of 2M first and second encoding registers; absolute-difference calculators; and an adder that adds the output of another arithmetic unit and the outputs of the absolute-difference calculators; the number of reference registers is 2M; a selector is provided that selects and outputs M data items from the reference registers, selecting only the M valid register outputs and supplying them to the arithmetic unit; and the error calculations are performed by a common arithmetic unit in time division.

7. The motion vector detection device according to claim 4, wherein a squared-difference calculator is used in place of the absolute-difference calculator.