JP2010041408A

JP2010041408A - Moving image encoding apparatus, moving image decoding apparatus, moving image encoding method and moving image decoding method

Info

Publication number: JP2010041408A
Application number: JP2008202105A
Authority: JP
Inventors: Takayuki Sugawara; 隆幸菅原; Seiji Higure; 誠司日暮
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2008-08-05
Filing date: 2008-08-05
Publication date: 2010-02-18

Abstract

PROBLEM TO BE SOLVED: To provide a moving image encoding technique with which high-definition progressive images are encoded with high efficiency together with standard interlaced images.

SOLUTION: A first interlaced image and a second interlaced image are extracted from a moving image. In the first interlaced image, odd-numbered fields and even-numbered fields are extracted alternately at a predetermined frequency; in the second interlaced image, odd-numbered fields and even-numbered fields are extracted in the opposite phase to the first interlaced image. A first interlaced moving image encoder 107 encodes the first interlaced image to produce first interlaced encoded data. A second interlaced moving image encoder 109 produces second interlaced encoded data by encoding the second interlaced image while using at least a first interlaced decoded image, obtained by decoding the first interlaced encoded data, as a reference image for predictive coding.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a moving image encoding apparatus and a moving image encoding method for encoding a moving image, and to a moving image decoding apparatus and a moving image decoding method for decoding encoded moving image data.

Video comes in two forms, interlaced and progressive. At 60 pictures per second (precisely, 59.94 Hz), progressive display carries more information and can deliver higher picture quality. Nevertheless, current DVDs (Digital Versatile Disks) and broadcasts basically use interlaced images: there are still 60 pictures per second, but every other line of the image is dropped, and the two sets of lines are displayed alternately. To compensate for this shortcoming, many recent DVD players and DTV (Digital Television) devices convert the interlaced image into a pseudo-progressive image on the display side, using image features and motion in the time direction, and display it in a progressive format.

As an image encoding technique, Patent Document 1 discloses the most basic encoding method for interlaced images, in which the two fields (odd and even) that make up one frame are encoded consecutively, field by field.

Patent Document 2 discloses a method for efficiently encoding a video signal in a scalable manner, so that the video image can be decoded at various resolution scales and in various image formats.

MPEG (Moving Picture Experts Group) is an internationally standardized moving image coding technology. MPEG is the abbreviated name of the organization established in 1988 under ISO/IEC JTC1/SC2 (Joint Technical Committee 1 of the International Organization for Standardization and the International Electrotechnical Commission, Subcommittee 2; now SC29) to study moving image coding standards. MPEG1 (MPEG Phase 1) is a standard targeting storage media at about 1.5 Mbps. It inherits the basic techniques of JPEG, which targets still image coding, and of H.261 (standardized by CCITT SG XV, now ITU-T SG15), which targets low-bit-rate moving image compression for ISDN video conferencing and videophones, and it introduces new techniques for storage media. It was established as ISO/IEC 11172 in August 1993.

MPEG2 (MPEG Phase 2) was established in November 1994 as ISO/IEC 13818 and H.262, as a general-purpose standard able to support diverse applications such as communications and broadcasting.

MPEG is realized by combining several techniques. An example of a general MPEG encoding apparatus is described with reference to FIG. 11. The subtractor 910 removes temporal redundancy by taking the difference between the input image and the decoded image from the motion compensation predictor 909. There are three prediction directions: past only, future only, and both past and future. The prediction direction can be switched for each 16 × 16 pixel macroblock (MB). The prediction directions that are available are determined by the picture type assigned to the input image. A P picture has two modes: prediction from the past, and independent encoding of the MB without prediction. A B picture has four modes: prediction from the future, prediction from the past, prediction from both the past and the future, and independent encoding. An I picture encodes all MBs independently.
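The relationship between picture type and permitted macroblock modes described above can be summarized as a small lookup table. This is only an illustrative sketch; the mode names (`intra`, `forward`, `backward`, `bidirectional`) and the helper `mode_allowed` are labels chosen here, not identifiers from the MPEG specification.

```python
# Macroblock prediction modes available for each picture type.
# "forward" means prediction from the past, "backward" prediction from the future.
# The mode names are illustrative labels, not MPEG syntax elements.
PREDICTION_MODES = {
    "I": ("intra",),
    "P": ("intra", "forward"),
    "B": ("intra", "forward", "backward", "bidirectional"),
}


def mode_allowed(picture_type, mode):
    """Return True if a macroblock of this picture type may use the mode."""
    return mode in PREDICTION_MODES[picture_type]
```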

Motion compensation is performed by pattern matching the moving region for each MB to detect a motion vector with half-pel accuracy, and then predicting from the reference shifted by the detected motion. The motion vector has horizontal and vertical components and is transmitted as side information of the MB, together with the MC (Motion Compensation) mode that indicates the prediction direction. The pictures from an I picture up to the picture before the next I picture are called a GOP (Group Of Pictures); for storage media and similar uses, a GOP generally consists of about 15 pictures.
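As a rough illustration of the pattern matching step, the sketch below performs an exhaustive block search using the sum of absolute differences (SAD). It is simplified on purpose: it searches integer-pel positions only, whereas the text describes half-pel accuracy, and the SAD criterion, the search range, and the function names are assumptions made for this example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, n=16, search=2):
    """Exhaustive integer-pel search; returns the (dx, dy) with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block falls outside the picture.
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

The prediction for the block is then the reference block at the returned displacement, and only the residual and the vector need to be coded.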

The difference image obtained by the subtractor 910 is orthogonally transformed by the DCT (Discrete Cosine Transform) unit 901. The DCT is an orthogonal transform obtained by discretizing, over a finite interval, an integral transform whose kernel is the cosine function. In MPEG, each MB is divided into four, and a two-dimensional DCT is applied to each 8 × 8 DCT block. Because a video signal generally has many low-frequency components and few high-frequency components, the coefficients concentrate in the low frequencies after the DCT.
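To make the energy compaction concrete, here is a direct (unoptimized) two-dimensional DCT-II of an 8 × 8 block with orthonormal scaling. It is a sketch for illustration; practical encoders use fast factorized transforms, and the function name `dct_2d` is chosen here.

```python
import math

N = 8  # DCT block size


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of N rows of N numbers).
    out[v][u] holds the coefficient for vertical frequency v, horizontal u."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = c(u) * c(v) * s
    return out
```

For a completely flat block, all of the energy lands in the single DC coefficient `out[0][0]`, which is the extreme case of the low-frequency concentration described above.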

The image data orthogonally transformed by the DCT unit 901 (the DCT coefficients) are quantized by the quantizer 902. The quantization value is obtained by multiplying the quantization matrix, an 8 × 8 set of values that weight the two-dimensional frequencies according to visual characteristics, by the quantization scale, a scalar that scales the whole matrix. Quantization divides each DCT coefficient by the corresponding quantization value. For inverse quantization on the decoder side, multiplying by the quantization value yields a value that approximates the original DCT coefficient.
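The divide-then-multiply relationship can be sketched as follows. This follows the simplified description in the text (quantization value = matrix entry × scale); the exact MPEG formulas include additional constants and rounding rules that are not reproduced here, and the function names are chosen for this example.

```python
def quantize(coeffs, matrix, scale):
    """Divide each coefficient by (matrix entry * scale) and round to an integer."""
    return [[round(c / (m * scale)) for c, m in zip(crow, mrow)]
            for crow, mrow in zip(coeffs, matrix)]


def dequantize(levels, matrix, scale):
    """Multiply each level by (matrix entry * scale) to approximate the coefficient."""
    return [[q * m * scale for q, m in zip(qrow, mrow)]
            for qrow, mrow in zip(levels, matrix)]
```

For example, with coefficients `[[800, 37], [-45, 3]]`, matrix `[[8, 16], [16, 32]]` and scale 2, the levels are `[[50, 1], [-1, 0]]` and the reconstruction is `[[800, 32], [-32, 0]]`: small high-frequency values are lost, which is where the compression comes from.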

The quantized data are variable-length coded by the VLC (Variable Length Coding) unit 903. Among the quantized values, the direct current (DC) component is coded by DPCM (differential pulse code modulation), a form of predictive coding. The alternating current (AC) components are zigzag scanned from low to high frequency, each combination of a zero run length and a nonzero coefficient value is treated as one event, and Huffman coding assigns shorter codes to events with a higher probability of occurrence.
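The steps that precede the Huffman table lookup can be sketched as below: DPCM of the DC terms, zigzag ordering, and conversion of the AC terms into (zero run, level) events. The Huffman tables themselves are defined by the standard and are not reproduced; the initial DPCM predictor of 0 and the function names are assumptions for this example.

```python
def dpcm(dc_values, predictor=0):
    """Code each DC value as its difference from the previous DC value."""
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs


def zigzag_order(n=8):
    """Return (row, col) positions in zigzag order, low to high frequency."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_level_events(block):
    """Zigzag-scan the AC coefficients into (zero_run, level) events."""
    events, run = [], 0
    for r, c in zigzag_order(len(block))[1:]:  # position 0 is the DC term
        if block[r][c] == 0:
            run += 1
        else:
            events.append((run, block[r][c]))
            run = 0
    return events  # trailing zeros are left implicit (end of block)
```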

The variable-length coded data are held temporarily in the buffer 904 and output as encoded data at a predetermined transfer rate. The amount of code generated for each macroblock of the output encoded data is sent to the code amount controller 905. The code amount controller 905 feeds the error between the generated code amount and the target code amount back to the quantizer 902, which adjusts the quantization scale according to this error. In this way, the code amount controller 905 keeps the code amount of the generated encoded data at an appropriate rate.
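The feedback loop can be illustrated with a simple proportional controller: when more bits are generated than targeted, the quantization scale is raised (coarser quantization), and vice versa. The text does not specify the control law, so the proportional update, the gain, and the scale limits of 1 to 31 below are assumptions for illustration only.

```python
def update_quant_scale(scale, generated_bits, target_bits,
                       gain=0.5, min_scale=1, max_scale=31):
    """Return a new quantization scale from the relative code amount error.
    The proportional law and the limits are illustrative assumptions."""
    error = (generated_bits - target_bits) / target_bits
    new_scale = scale * (1 + gain * error)
    return max(min_scale, min(max_scale, round(new_scale)))
```

Called once per macroblock (or per picture), this nudges the generated code amount toward the target without letting the scale leave its valid range.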

The quantized image data are inverse quantized by the inverse quantizer 906, inverse DCT transformed by the inverse DCT unit 907, and stored temporarily in the image memory 908. The motion compensation predictor 909 then uses them as the decoded reference image for computing the difference image.

Next, an example of a general MPEG decoding apparatus is described with reference to FIG. 12. The encoded moving image data are buffered in the buffer 1001, and the data from the buffer are input to the VLD unit 1002. The VLD unit 1002 variable-length decodes the image data to obtain the direct current (DC) component and the alternating current (AC) components. The AC component data are placed in an 8 × 8 matrix in zigzag scan order, from low to high frequency. These data are input to the inverse quantizer 1003 and inverse quantized with the quantization matrix. The inverse quantized data are input to the inverse DCT unit 1004, inverse DCT transformed, and output as image data (decoded data). The decoded data are also stored temporarily in the image memory 1006, and the motion compensation predictor 1005 then uses them as the decoded reference image for computing the difference image.

JP-A-6-98314
JP-A-7-162870

Because a very large number of applications adopt standardized MPEG2 interlaced video coding, encoded moving image data must include a standard interlaced image in order to maintain compatibility. At the same time, there is a need to play back high-quality moving images, so a moving image encoding technique is sought that can encode high-definition progressive images with high efficiency together with standard interlaced images. If a progressive image is encoded in a single set of moving image data in addition to the standard interlaced image, an application can select either the interlaced image or the progressive image according to its needs.

Interlaced coding methods, progressive coding methods, and hierarchical coding methods that take differences between images of different resolutions have been developed. However, it has not so far been possible to encode a progressive moving image alongside a standard interlaced moving image while maintaining both coding efficiency and full compatibility with the standard interlaced moving image.

The present invention has been made in view of these circumstances, and its object is to provide a moving image encoding technique and a moving image decoding technique that allow both interlaced playback and progressive playback of moving image data.

To solve the above problem, a moving image encoding apparatus according to one aspect of the present invention includes: an extraction unit that extracts, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are taken alternately at a predetermined frequency, and a second interlaced image in which odd fields and even fields are taken from the moving image in the opposite phase to the first interlaced image; a first encoding unit that encodes the first interlaced image to generate first interlaced encoded data; and a second encoding unit that generates second interlaced encoded data by encoding the second interlaced image using, at least, a first interlaced decoded image obtained by decoding the first interlaced encoded data as a reference image for predictive coding.

Another aspect of the present invention is also a moving image encoding apparatus. This apparatus includes: an extraction unit that extracts, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are taken alternately at a predetermined frequency, and a progressive image containing both odd-numbered and even-numbered scanning lines; a first encoding unit that encodes the first interlaced image to generate first interlaced encoded data; and a progressive encoding unit. The progressive encoding unit generates second interlaced encoded data by encoding a second interlaced image, whose odd and even fields are in the opposite phase to the first interlaced image, using at least, as reference images for predictive coding, a first interlaced decoded image obtained by decoding the first interlaced encoded data and the first interlaced image contained in the progressive image supplied from the extraction unit. It then generates progressive encoded data by combining first interlaced encoded data, obtained by encoding the first interlaced image contained in the progressive image, with the generated second interlaced encoded data.

Yet another aspect of the present invention is a moving image decoding apparatus. This apparatus includes: a separation unit that separates and extracts, from encoded moving image data, first interlaced encoded data, in which a first interlaced image whose odd fields (odd-numbered scanning lines) and even fields (even-numbered scanning lines) are taken alternately at a predetermined frequency is encoded, and second interlaced encoded data, in which a second interlaced image whose odd and even fields are in the opposite phase to the first interlaced image is encoded; a first decoding unit that decodes the first interlaced encoded data to generate a first interlaced decoded image; a second decoding unit that decodes the second interlaced encoded data to generate a second interlaced decoded image by performing prediction using at least the first interlaced decoded image as a reference image; and a progressive image generation unit that generates a progressive image by combining the first interlaced decoded image and the second interlaced decoded image.

Yet another aspect of the present invention is also a moving image decoding apparatus. This apparatus includes: a separation unit that separates and extracts, from encoded moving image data, first interlaced encoded data, in which a first interlaced image whose odd fields (odd-numbered scanning lines) and even fields (even-numbered scanning lines) are taken alternately at a predetermined frequency is encoded, and progressive encoded data, in which a progressive image containing both odd-numbered and even-numbered scanning lines is encoded; a first decoding unit that decodes the first interlaced encoded data to generate a first interlaced decoded image; and a progressive decoding unit. The progressive decoding unit decodes second interlaced encoded data, in which a second interlaced image whose odd and even fields are in the opposite phase to the first interlaced image is encoded, to generate a second interlaced decoded image by performing prediction using at least, as reference images, the first interlaced decoded image decoded by the first decoding unit and a first interlaced decoded image obtained by decoding the first interlaced encoded data contained in the progressive encoded data. It then generates a progressive decoded image by combining the first interlaced decoded image, obtained by decoding the first interlaced encoded image contained in the progressive encoded data, with the generated second interlaced decoded image.

Yet another aspect of the present invention is a moving image encoding method. This method includes the steps of: extracting, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are taken alternately at a predetermined frequency, and a second interlaced image in which odd fields and even fields are taken from the moving image in the opposite phase to the first interlaced image; encoding the first interlaced image to generate first interlaced encoded data; and generating second interlaced encoded data by encoding the second interlaced image using, at least, a first interlaced decoded image obtained by decoding the first interlaced encoded data as a reference image for predictive coding.

Yet another aspect of the present invention is a moving image decoding method. This method includes the steps of: separating and extracting, from encoded moving image data, first interlaced encoded data, in which a first interlaced image whose odd fields (odd-numbered scanning lines) and even fields (even-numbered scanning lines) are taken alternately at a predetermined frequency is encoded, and second interlaced encoded data, in which a second interlaced image whose odd and even fields are in the opposite phase to the first interlaced image is encoded; decoding the first interlaced encoded data to generate a first interlaced decoded image; decoding the second interlaced encoded data to generate a second interlaced decoded image by performing prediction using at least the first interlaced decoded image as a reference image; and generating a progressive image by combining the first interlaced decoded image and the second interlaced decoded image.

Any combination of the above constituent elements, and any conversion of the expression of the present invention between a method, an apparatus, a system, a recording medium, a computer program, and the like, are also valid as aspects of the present invention.

According to the present invention, both interlaced playback and progressive playback of moving image data can be achieved.

Embodiment 1
FIG. 1 is a configuration diagram of a moving image encoding apparatus according to Embodiment 1. This embodiment describes a configuration in which a moving image is captured and recorded on a recording medium. However, frames of an already captured moving image may be input instead, and the encoded moving image data may be output externally as stream data without being recorded on a recording medium.

An image is captured on the light receiving element 102 for moving images through the lens 101. The light receiving element 102 is, for example, a CCD (Charge Coupled Device) or CMOS sensor, and is a moving image pickup element from which a progressive image can be read out. The data output from the light receiving element 102 are A/D converted by the A/D converter 103.

The A/D converter 103 outputs two kinds of interlaced images, A and B, which complement each other, as digital data. The first interlaced image A is input to the first interlaced video encoder 107, and the second interlaced image B is input to the second interlaced video encoder 109.

FIG. 2 illustrates the relationship between the two complementary interlaced images A and B, which are input to the first interlaced video encoder 107 and the second interlaced video encoder 109, and the progressive image C.

FIG. 2(a) shows the first interlaced image A input to the first interlaced video encoder 107. Approximately every 1/60 second (precisely, 1/59.94 second), an image consisting of the scanning lines shown in white (for example, 720 pixels wide by 240 pixels high) is input. The first picture is an odd field consisting of odd-numbered scanning lines, the second picture is an even field consisting of even-numbered scanning lines, and thereafter odd fields and even fields alternate.

FIG. 2(b) shows the second interlaced image B input to the second interlaced video encoder 109. Likewise, approximately every 1/60 second (precisely, 1/59.94 second), an image consisting of the scanning lines shown in white (720 pixels wide by 240 pixels high) is input. The second interlaced image B is complementary to the first interlaced image A: its odd fields and even fields are in the opposite phase to those of the first interlaced image A. That is, the first picture of the second interlaced image B is an even field consisting of even-numbered scanning lines, the second picture is an odd field consisting of odd-numbered scanning lines, and thereafter even fields and odd fields alternate.

When the first interlaced image A of FIG. 2(a) and the second interlaced image B of FIG. 2(b) are combined, the odd lines and even lines complement each other, yielding the progressive image C (720 pixels wide by 480 pixels high) shown in FIG. 2(c).
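The complementary relationship between A, B, and C can be sketched in a few lines. The example below splits a sequence of progressive frames into the two opposite-phase field sequences and shows that interleaving them restores the frames. It assumes each frame is a list of scan lines with an even number of lines, with list index 0 holding scan line 1; the function names are chosen for this illustration.

```python
def split_fields(frames):
    """Split progressive frames into interlaced sequences A and B.
    A takes odd, even, odd, ... fields; B takes the opposite phase."""
    seq_a, seq_b = [], []
    for i, frame in enumerate(frames):
        odd = frame[0::2]   # scan lines 1, 3, 5, ...
        even = frame[1::2]  # scan lines 2, 4, 6, ...
        if i % 2 == 0:
            seq_a.append(odd)
            seq_b.append(even)
        else:
            seq_a.append(even)
            seq_b.append(odd)
    return seq_a, seq_b


def merge_fields(seq_a, seq_b):
    """Interleave the fields of A and B back into progressive frames."""
    frames = []
    for i, (field_a, field_b) in enumerate(zip(seq_a, seq_b)):
        odd, even = (field_a, field_b) if i % 2 == 0 else (field_b, field_a)
        frame = []
        for odd_line, even_line in zip(odd, even):
            frame.extend([odd_line, even_line])
        frames.append(frame)
    return frames
```

A player that only understands sequence A sees an ordinary interlaced stream, while a player that has both A and B can rebuild every full frame.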

Refer again to FIG. 1. The first interlaced video encoder 107 may encode the first interlaced image A with any coding scheme, as long as it is a standard interlace-capable scheme such as ordinary MPEG video. In view of compatibility, encoding in accordance with a standard is preferable. For example, using a standard such as MPEG, H.264, or VC-1, as adopted in media standards such as DVD and Blu-ray Disc (trademark or registered trademark) or in digital broadcasting standards, is important in this embodiment because it makes the encoding compatibility-conscious.

The video local decoder 108 decodes the first interlaced image A encoded by the first interlaced video encoder 107 to generate a first interlaced decoded image A', and supplies A' to the second interlaced video encoder 109. The encoding apparatus decodes the encoded first interlaced image A locally because the moving image decoding apparatus described later has no access to the first interlaced image A as it was before encoding, and must decode the second interlaced image B using the first interlaced decoded image A'.

The second interlaced video encoder 109 of FIG. 1 predictively encodes the second interlaced image B using the second interlaced image B supplied from the A/D converter 103 and the first interlaced decoded image A' supplied from the video local decoder 108. Taking the second interlaced image B being encoded as the time reference, the second interlaced video encoder 109 uses as reference images not only past second interlaced images B but also the first interlaced decoded image A' supplied from the video local decoder 108. The details of encoding by the second interlaced video encoder 109 are described later.
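One way to picture the use of several reference candidates is the sketch below, which picks, for a field of B, whichever candidate reference (a past field of B or a locally decoded field of A') leaves the smallest residual. This is a hypothetical simplification: the selection criterion (field-level sum of absolute differences), the candidate names, and the function names are assumptions for illustration, and the actual encoder 109 is described later in the specification.

```python
def field_sad(a, b):
    """Sum of absolute differences between two fields of equal size."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))


def choose_reference(current_b, candidates):
    """candidates maps a label to a reference field.
    Returns (label, residual) for the candidate with the smallest SAD."""
    label = min(candidates, key=lambda k: field_sad(current_b, candidates[k]))
    ref = candidates[label]
    residual = [[p - q for p, q in zip(row_c, row_r)]
                for row_c, row_r in zip(current_b, ref)]
    return label, residual
```

Because the candidate built from A' is the decoded image rather than the original A, the decoder can form exactly the same prediction.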

Meanwhile, the audio data output from the audio input device 104 are converted to digital data by the A/D converter 105 and input to the audio encoder 106. The audio encoder 106 performs compression such as MPEG audio coding or Dolby AC3. Because these coding schemes are standardized, their description is omitted here.

The first interlaced encoded data output from the first interlaced video encoder 107, the second interlaced encoded data output from the second interlaced video encoder 109, and the audio encoded data output from the audio encoder 106 are each input to the multiplexer 110.

The multiplexer 110 multiplexes these three kinds of data by packet multiplexing. The multiplexer 110 may be replaced with a picture user data generator, which uses the user data area of each picture; a system can be built either way. The details of the multiplexer 110 and the picture user data generator are described later. In either case, the three kinds of encoded data are combined into a single stream, which the recording medium writer 111 records on the recording medium 112.
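As a toy illustration of packet multiplexing, the sketch below cuts each elementary stream into fixed-size packets tagged with a stream identifier and interleaves them round-robin; a demultiplexer regroups the payloads by identifier. The packet size, the round-robin schedule, the stream labels, and the tuple-based packet format are all assumptions for this example and do not represent the format used by the multiplexer 110.

```python
def packetize(stream_id, payload, size):
    """Cut a byte string into (stream_id, chunk) packets of at most size bytes."""
    return [(stream_id, payload[i:i + size])
            for i in range(0, len(payload), size)]


def multiplex(streams, size=4):
    """streams maps a stream id to bytes. Interleave the packets round-robin."""
    queues = [packetize(sid, data, size) for sid, data in streams.items()]
    out = []
    while any(queues):
        for queue in queues:
            if queue:
                out.append(queue.pop(0))
    return out


def demultiplex(packets):
    """Regroup packet payloads by stream id, preserving order."""
    streams = {}
    for sid, chunk in packets:
        streams[sid] = streams.get(sid, b"") + chunk
    return streams
```

A legacy player would simply discard packets whose identifier it does not recognize, which is how a single stream can serve both interlaced-only and progressive-capable decoders.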

図３は、第１インターレース動画符号化器１０７および第２インターレース動画符号化器１０９による片方向予測符号化を説明する図である。 FIG. 3 is a diagram for explaining unidirectional predictive encoding by the first interlace moving image encoder 107 and the second interlace moving image encoder 109.

図３（ａ）は、第１インターレース動画符号化器１０７が第１インターレース画像Ａを片方向予測符号化する様子を示す。これはＭＰＥＧなどで使用される片方向予測符号化の説明図である。インターレース動画像であるため、横軸を時間軸とすると、１／６０秒（正確には１／５９．９４秒）毎に１枚のフィールド画像が入力される。 FIG. 3A shows a state in which the first interlaced video encoder 107 performs one-way predictive encoding of the first interlaced image A. This is an explanatory diagram of one-way predictive coding used in MPEG and the like. Since it is an interlaced moving image, one field image is input every 1/60 seconds (exactly 1 / 59.94 seconds) when the horizontal axis is the time axis.

第１インターレース画像Ａは、奇数フィールド、偶数フィールドの順にフィールド画像が時間方向に並んだものである。上段には奇数フィールド、下段には偶数フィールドの画像が図示されている。 The first interlaced image A is an image in which field images are arranged in the time direction in the order of odd fields and even fields. An odd field is shown in the upper row and an even field image is shown in the lower row.

First, intra coding, which is completed within a single MPEG picture, is performed at a predetermined interval. For example, it is appropriate to intra-code about two fields out of every 30 (the two forming an odd-field/even-field pair) as I pictures. The remaining fields are predictively encoded as P pictures. A P picture is unidirectionally predicted with reference to an intra-coded I picture or a past P picture.

図３（ａ）には、ＭＰＥＧ２のメインプロファイルで規定された片方向予測符号化における参照関係が矢印で示されている。符号化のシンタックスはＭＰＥＧに規定されているので詳細は省略する。 In FIG. 3A, the reference relationship in the unidirectional predictive coding defined by the MPEG2 main profile is indicated by arrows. Since the encoding syntax is defined in MPEG, details are omitted.

For example, the P picture of the third field (an odd field) is predictively encoded in the forward direction with reference to the I picture of the first field (odd) and the I picture of the second field (even). The P picture of the fourth field (even) is predictively encoded in the forward direction with reference to the I picture of the second field (even) and the P picture of the third field (odd).

In predictive coding, the encoder does not simply take the difference between whole pictures. Instead, for each macroblock in the picture (for example, a block 16 pixels wide by 8 pixels high, or 16 by 16 pixels), it searches, with a precision of 1 pixel, 1/2 pixel, or 1/4 pixel, for how far the content of the macroblock has moved. It finds the position where an evaluation value of the per-macroblock difference, such as the sum of absolute differences or the sum of squared differences, is smallest, and thereby obtains the direction and magnitude of the motion vector (MV). The macroblock is then displaced according to this motion vector, and the difference from the predicted reference image is taken and encoded.
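The search described above can be illustrated with a minimal sketch. It assumes integer-pel precision only, an exhaustive search window, and the sum of absolute differences as the evaluation value; the names `sad`, `full_search`, and `residual` are hypothetical and do not come from the patent or from any MPEG reference software.

```python
def sad(cur, ref, bx, by, dx, dy, bw, bh):
    """Sum of absolute differences between the block of `cur` at (bx, by)
    and the block of `ref` displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(bh):
        for x in range(bw):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, bw=16, bh=16, search=4):
    """Exhaustive integer-pel search. Returns (best_cost, (dx, dy))."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if bx + dx < 0 or bx + dx + bw > width:
                continue
            if by + dy < 0 or by + dy + bh > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bw, bh)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


def residual(cur, ref, bx, by, mv, bw=16, bh=16):
    """Difference block that would be transformed and encoded."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx]
             for x in range(bw)] for y in range(bh)]
```

A real encoder would refine the integer-pel result to 1/2 or 1/4 pixel by interpolating the reference; that step is omitted here.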

マクロブロック毎に差分が最も小さくなる参照画像を選ぶため、選択された参照画像を特定するための識別情報をマクロブロック単位で設定し、動きベクトルの情報とともに符号化データ内に記録する。 In order to select a reference image with the smallest difference for each macroblock, identification information for specifying the selected reference image is set for each macroblock, and is recorded in the encoded data together with the motion vector information.

図３（ｂ）は、第２インターレース動画符号化器１０９が第２インターレース動画Ｂを片方向予測符号化する様子を示す。実線で示した第２インターレース動画像Ｂは、偶数フィールド、奇数フィールドの順に画像が時間方向に並んでおり、点線で示した第１インターレース動画像Ａと相補的な関係にある。 FIG. 3B shows a state in which the second interlace video encoder 109 unidirectionally encodes the second interlace video B. The second interlaced moving image B indicated by the solid line is arranged in the time direction in the order of the even field and the odd field, and has a complementary relationship with the first interlaced moving image A indicated by the dotted line.

Unlike normal MPEG encoding, the second interlaced video encoder 109 predictively encodes the second interlaced image B with reference not only to field images of the second interlaced video B but also to locally decoded field images of the first interlaced video A. Since a picture encoded in this way by the second interlaced video encoder 109 is neither an I picture nor a P picture, it is called an "M picture" as a new picture type.

The first field of the second interlaced image B, an even field, is intra-coded and becomes an I picture. The second field of the second interlaced image B, an odd field, is predictively encoded with reference to three images: the decoded image I' of the first (odd) field of the first interlaced image A, the picture I of the first (even) field of the second interlaced image B, and the decoded image I' of the second (even) field of the first interlaced image A. It becomes an M picture.

The third field of the second interlaced image B, an even field, is predictively encoded with reference to three images: the decoded image I' of the second (even) field of the first interlaced video A, the picture M of the second (odd) field of the second interlaced image B, and the decoded image P' of the third (odd) field of the first interlaced image A. It becomes an M picture.

Thereafter, the k-th (k > 3) field image of the second interlaced image B is predictively encoded with reference to three images: the (k−1)-th decoded field image P' of the first interlaced image A, the (k−1)-th field image M of the second interlaced image B, and the k-th decoded field image P' of the first interlaced image A. It becomes an M picture.

Thus, three reference images are used for unidirectional predictive coding of an M picture of the second interlaced image B at time t: the locally decoded field image of the first interlaced image A at time (t−1), the field image M of the second interlaced image B at time (t−1), and the locally decoded field image of the first interlaced image A at time t. The reference relationships are indicated by three arrows. Here, one unit of time is 1/60 second (more precisely, 1/59.94 second).
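The reference pattern for M pictures can be written down compactly. The sketch below assumes 1-based field numbering, with stream A starting on an odd field and stream B on an even field as the text describes; the function names and the tuple layout are hypothetical.

```python
def field_parity(stream, k):
    """Parity of the k-th field (1-based). Stream A runs odd, even, odd, ...
    and stream B runs with the opposite phase."""
    a_is_odd = (k % 2 == 1)
    if stream == "A":
        return "odd" if a_is_odd else "even"
    if stream == "B":
        return "even" if a_is_odd else "odd"
    raise ValueError("stream must be 'A' or 'B'")


def m_picture_references(k):
    """Reference fields for the k-th field of stream B when it is coded as
    an M picture. Each entry is (stream, field_index, parity); stream A
    entries stand for locally decoded fields."""
    if k < 2:
        raise ValueError("the first field of B is intra-coded")
    return [
        ("A", k - 1, field_parity("A", k - 1)),  # decoded A at time t-1
        ("B", k - 1, field_parity("B", k - 1)),  # previous B field
        ("A", k, field_parity("A", k)),          # decoded A at time t
    ]
```

For k = 3 this yields the second (even) field of A, the second (odd) field of B, and the third (odd) field of A, matching the description of the third field of B.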

If only field images of the second interlaced image B were used as reference images in unidirectional predictive coding of the second interlaced image B, then, as in FIG. 3(a), only the field images one or two time units earlier could be used. In this embodiment, by contrast, field images of the first interlaced image A are also adopted as reference images, and the second interlaced image B is predictively encoded with reference to the three temporally nearest field images. The prediction error can therefore be expected to be smaller, which raises coding efficiency.

Predictive coding is performed between the reference images and the picture being predicted, as indicated by the arrows in FIGS. 3(a) and 3(b). (1) When there are multiple reference images, the image giving the smallest difference is selected. (2) In bidirectional prediction, another option is to select two images, one future and one past, and take the difference against their averaged prediction. (3) Taking the difference using two images, one in the forward direction and one in the backward direction, is also an option. All of these are MPEG2 predictive coding methods; one of the three may be used alone, or two or all three may be combined. When multiple combinations are available, identification information indicating the selected one is placed in the syntax bits that indicate the macroblock prediction method. The resulting difference image is then subjected to a DCT on blocks of 8 by 8 pixels and quantized. The quantized DCT coefficients are scanned in a predetermined order and variable-length coded (VLC).
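The selection among single-reference and averaged predictions can be sketched as follows. This is an illustration under stated assumptions, not the MPEG2 procedure itself: candidates are already motion-compensated blocks, the cost is the sum of absolute differences, the average is rounded half up, and the names `choose_prediction`, `average`, and `sad` are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def average(a, b):
    """Pixel-wise average of two prediction blocks, rounded half up."""
    return [[(p + q + 1) // 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def choose_prediction(cur, candidates):
    """Pick the prediction with the smallest difference from `cur`.

    `candidates` maps a reference name to its motion-compensated block.
    Returns (mode, prediction, cost), where `mode` is a tuple holding one
    name (single reference) or two names (averaged prediction)."""
    options = {(name,): block for name, block in candidates.items()}
    names = sorted(candidates)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            options[(first, second)] = average(candidates[first],
                                               candidates[second])
    mode = min(options, key=lambda m: sad(cur, options[m]))
    return mode, options[mode], sad(cur, options[mode])
```

The returned `mode` plays the role of the identification information that the text says is written into the macroblock syntax bits.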

第２インターレース画像Ｂの予測符号化においては、最大３つの参照画像があるので、予測符号化にあたって使用した参照画像を特定するために２ビットの識別子を用いる。この識別ビットは、マクロブロック毎に変化できることになるので、マクロブロック毎に動きベクトルを格納する領域に識別ビットを記述するのが望ましい。 In the predictive encoding of the second interlaced image B, there are a maximum of three reference images, so a 2-bit identifier is used to identify the reference image used for predictive encoding. Since this identification bit can be changed for each macroblock, it is desirable to describe the identification bit in an area for storing a motion vector for each macroblock.
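With at most three reference images, two bits per macroblock are enough. The sketch below packs four 2-bit identifiers per byte purely to show the bit budget; the value assignment in `REF_IDS` and the standalone byte layout are assumptions, since the text only says the bits should sit in the per-macroblock motion vector area.

```python
# Hypothetical assignment of 2-bit codes; the value 3 is left unused.
REF_IDS = {"A_prev": 0, "B_prev": 1, "A_cur": 2}


def pack_ref_ids(ids):
    """Pack a list of 2-bit identifiers, four per byte, most significant first."""
    out = bytearray()
    for i in range(0, len(ids), 4):
        byte = 0
        for j, value in enumerate(ids[i:i + 4]):
            if not 0 <= value <= 3:
                raise ValueError("identifier does not fit in 2 bits")
            byte |= value << (6 - 2 * j)
        out.append(byte)
    return bytes(out)


def unpack_ref_ids(data, count):
    """Recover `count` identifiers from bytes produced by pack_ref_ids."""
    return [(data[n // 4] >> (6 - 2 * (n % 4))) & 0b11 for n in range(count)]
```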

The above describes the case where the first interlaced video encoder 107 and the second interlaced video encoder 109 unidirectionally predictively encode the first interlaced image A and the second interlaced image B, respectively, but bidirectional predictive coding may be used instead.

図４は、第１インターレース動画符号化器１０７および第２インターレース動画符号化器１０９による双方向予測符号化を説明する図である。 FIG. 4 is a diagram for explaining bi-directional predictive encoding by the first interlace moving image encoder 107 and the second interlace moving image encoder 109.

図４（ａ）は、第１インターレース動画符号化器１０７が第１インターレース動画Ａを双方向予測符号化する様子を示す。これはＭＰＥＧなどで使用される双方向予測符号化である。図３（ａ）との違いは、双方向予測されるＢピクチャーが存在していることである。インターレース画像であるため、過去と未来のフィールド画像を奇数および偶数のペアで参照することになり、過去、未来の４枚の参照画像を用いて予測符号化される。この方式はＭＰＥＧと同様であるため、詳細な説明は省略する。 FIG. 4A shows a state in which the first interlace video encoder 107 bi-directionally encodes the first interlace video A. This is bidirectional predictive coding used in MPEG and the like. The difference from FIG. 3A is that there is a B picture that is bidirectionally predicted. Since it is an interlaced image, the past and future field images are referred to by odd and even pairs, and prediction coding is performed using four past and future reference images. Since this method is the same as MPEG, detailed description is omitted.

図４（ｂ）は、第２インターレース動画符号化器１０９が第２インターレース動画Ｂを双方向予測符号化する様子を示す。 FIG. 4B shows a state in which the second interlace video encoder 109 performs bi-directional predictive encoding on the second interlace video B.

第２インターレース動画符号化器１０９による第２インターレース画像Ｂの双方向予測符号化は、直近の第１インターレース動画像Ａの局部復号化フィールド画像を参照してなされる。第２インターレース動画符号化器１０９によって双方向予測符号化される画像を新たな画像タイプとして「Ｎピクチャ」と呼ぶことにする。 Bidirectional predictive encoding of the second interlaced image B by the second interlaced video encoder 109 is performed with reference to the local decoded field image of the latest first interlaced video A. An image that is bi-directionally predictively encoded by the second interlace video encoder 109 will be referred to as an “N picture” as a new image type.

The first field of the second interlaced image B, an even field, is intra-coded and becomes an I picture. The second field of the second interlaced image B, an odd field, is predictively encoded with reference to three images: the decoded image I' of the first (odd) field of the first interlaced image A, the decoded image I' of the second (even) field of the first interlaced image A, and the decoded image B' of the third (odd) field of the first interlaced image A. It becomes an N picture.

The third field of the second interlaced image B, an even field, is predictively encoded with reference to three images: the decoded image I' of the second (even) field of the first interlaced image A, the decoded image B' of the third (odd) field of the first interlaced image A, and the image B' of the fourth (even) field of the first interlaced image A. It becomes an N picture.

以降、第２インターレース画像Ｂのｋ（＞３）番目のフィールド画像は、第１インターレース画像Ａの（ｋ−１）番目の復号化フィールド画像と、第１インターレース画像Ａのｋ番目の符号化フィールド画像と、第１インターレース画像Ａの（ｋ＋１）番目の復号化フィールド画像の３種類を参照して予測符号化され、Ｎピクチャとなる。 Thereafter, the k (> 3) th field image of the second interlaced image B is the (k-1) th decoded field image of the first interlaced image A and the kth encoded field of the first interlaced image A. Predictive encoding is performed with reference to three types of images and the (k + 1) -th decoded field image of the first interlaced image A, resulting in an N picture.

Thus, three reference images are used for bidirectional predictive coding of an N picture of the second interlaced image B at time t: the locally decoded field image of the first interlaced image at time (t−1), the locally encoded field image of the first interlaced image A at time t, and the locally decoded field image of the first interlaced image A at time (t+1). The reference relationships are indicated by three arrows. Here, one unit of time is 1/60 second (more precisely, 1/59.94 second).
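For N pictures all three references come from stream A, including one future field, so that field must be available before the N picture can be predicted. A minimal sketch, assuming 1-based field indices and hypothetical function names:

```python
def n_picture_references(k):
    """Indices of the stream A fields referenced by the k-th field of
    stream B when it is coded as an N picture (k >= 2)."""
    if k < 2:
        raise ValueError("the first field of B is intra-coded")
    return [k - 1, k, k + 1]


def can_predict(k, available_a_fields):
    """True when every stream A field needed by N picture k is available,
    including the future field k + 1."""
    return all(i in available_a_fields for i in n_picture_references(k))
```

Because field k + 1 of stream A is a future field in display order, an implementation would have to process it before N picture k, in the same way B pictures are reordered in conventional MPEG coding.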

If only field images of the second interlaced image B were used as reference images in bidirectional predictive coding of the second interlaced image B, then, as in FIG. 3(a), only field images one or two time units before or after could be used. In this embodiment, by contrast, field images of the first interlaced image A are adopted as reference images, and the second interlaced image B is predictively encoded with reference to three nearby field images from both the past and future directions. The prediction error is therefore reduced and coding efficiency improves.

図５は、多重化器／ピクチャユーザデータ生成器１１０が第１インターレース符号化データと第２インターレース符号化データをマージする方法を説明する図である。 FIG. 5 is a diagram illustrating a method in which the multiplexer / picture user data generator 110 merges the first interlace encoded data and the second interlace encoded data.

FIG. 5(a) shows how the picture user data generator 110 merges the two types of encoded data using the user data method. MPEG provides an area in which user data can be written for each picture. Areas for user data exist not only in the picture layer but also in other layers such as the GOP layer. For a layer that groups multiple pictures, such as the GOP layer, a predetermined number of pictures of second interlace encoded data can be collected and stored together in the user data area. Here, an example in which second interlace encoded data is inserted for each picture is described.

The second interlace encoded data, which has a complementary relationship, is inserted picture by picture into the user data area of the picture layer of the first interlace encoded data. Basically, the data from one picture header up to just before the next picture header can be regarded as the data of one picture. However, when a GOP header or sequence header precedes the intra picture at the head of a GOP, those headers are treated as part of that intra picture's data.

As shown in FIG. 5(a), the second interlace encoded data is stored in the user data area within the MPEG picture data. The user data uses user_data() from the MPEG2 video layer syntax. user_data() begins with user_start_code, a uniquely identifiable, byte-aligned start code, and the user data can continue until the 3-byte sequence 0x000001 is next received. Because other applications may also be using user_data(), a unique code of about 4 bytes, for example 0x22220204, is written after the user_start_code of user_data() to indicate that the data belongs to this scheme. This prevents confusion with user data used for other purposes.
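The layout just described can be sketched as a byte string: start code, unique code, then payload. The value 0x000001B2 is the MPEG-2 video user data start code; the function name `build_user_data` is hypothetical, and rejecting payloads that contain 0x000001 is a simplifying assumption of this sketch, since the text does not say how start code emulation inside the embedded data is avoided.

```python
USER_DATA_START_CODE = bytes.fromhex("000001B2")  # MPEG-2 user data start code
UNIQUE_CODE = bytes.fromhex("22220204")           # example value from the text
START_CODE_PREFIX = bytes.fromhex("000001")


def build_user_data(payload):
    """Wrap second-stream bytes in a user_data() unit tagged with the
    unique code.

    Assumption: the payload must not contain the 3-byte start code prefix,
    because a parser would take that as the end of the user data. A real
    system would need an escaping rule, which is not modelled here."""
    if START_CODE_PREFIX in payload:
        raise ValueError("payload would emulate a start code")
    return USER_DATA_START_CODE + UNIQUE_CODE + payload
```

For example, `build_user_data(b"\x10\x20")` produces the ten bytes `00 00 01 B2 22 22 02 04 10 20`.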

図５（ｂ）は、多重化器１１０がパケット多重化方式によって二種類の符号化データをマージする方法を示す。２種類のビデオ符号化データと１種類のオーディオ符号化データをＭＰＥＧ２のシステムレイヤ規格に準拠して多重化する。オーディオ符号化データ、第１インターレース符号化データ、および第２インターレース符号化データをパケットヘッダとともにパケット化する。それぞれのパケットヘッダには、オーディオ１、ビデオ１、ビデオ２のいずれであるかを識別するための識別ＩＤを記載する。ビデオ１の識別子は第１インターレース符号化データを示し、ビデオ２の識別子は第２インターレース符号化データを示し、オーディオ１の識別子は音声符号化データを示す。 FIG. 5B shows a method in which the multiplexer 110 merges two types of encoded data by a packet multiplexing method. Two types of video encoded data and one type of audio encoded data are multiplexed in accordance with the MPEG2 system layer standard. Audio encoded data, first interlace encoded data, and second interlace encoded data are packetized together with a packet header. In each packet header, an identification ID for identifying whether it is audio 1, video 1 or video 2 is described. The identifier of video 1 indicates first interlace encoded data, the identifier of video 2 indicates second interlace encoded data, and the identifier of audio 1 indicates audio encoded data.

２種類のビデオと１種類のオーディオの各復号器においてバッファオーバーフローが起きないように、これらの符号化データを一つのストリームに多重化する。ＭＰＥＧ２の多重化規格に詳細は記載されているので、ここでは説明を省略する。 These encoded data are multiplexed into one stream so that a buffer overflow does not occur in each decoder of two types of video and one type of audio. Since details are described in the MPEG2 multiplexing standard, the description is omitted here.

FIG. 6 is a configuration diagram of the video decoding apparatus according to Embodiment 1. The recording medium 201 is the recording medium 112 on which encoded video and audio data were recorded by the recording medium writer 111 of the video encoding apparatus of FIG. 1. Three types of data, namely the first interlace encoded data, the second interlace encoded data, and the audio encoded data, are recorded on the recording medium 201 combined into a single stream. The methods for combining them into one stream are the packet multiplexing method of FIG. 5(b) and the picture user data method of FIG. 5(a).

The recording medium reader 202 reads the encoded data from the recording medium 201 and supplies it to the demultiplexer/user data separator 203. When the encoded data has been recorded using the packet multiplexing method of FIG. 5(b), the demultiplexer 203 uses the identification ID in each packet header to identify whether the packetized, multiplexed data is audio 1, video 1, or video 2. The demultiplexer 203 determines that the data is audio encoded data if the identification ID is audio 1, first interlace encoded data if it is video 1, and second interlace encoded data if it is video 2, and supplies each type of encoded data to its decoder: the audio decoder 204, the first interlaced video decoder 208, or the second interlaced video decoder 207, respectively.
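The routing performed by the demultiplexer can be sketched as follows. Packets are modelled as simple (identifier, payload) pairs rather than real MPEG2 system layer packets, and the identifier strings and function name are hypothetical.

```python
def demultiplex(packets):
    """Route packet payloads to per-stream buffers by identification ID.

    `packets` is an iterable of (stream_id, payload) pairs in stream order.
    Returns a dict with the reassembled bytes for each known stream."""
    streams = {
        "audio1": bytearray(),  # to the audio decoder 204
        "video1": bytearray(),  # to the first interlaced video decoder 208
        "video2": bytearray(),  # to the second interlaced video decoder 207
    }
    for stream_id, payload in packets:
        if stream_id not in streams:
            continue  # ignore packets this sketch does not know about
        streams[stream_id] += payload
    return {name: bytes(buf) for name, buf in streams.items()}
```

A player that supports only conventional interlaced playback could apply the same routing and simply discard the `video2` buffer.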

When the encoded data has been recorded using the user data method of FIG. 5(a), the user data separator 203 separates and extracts the second interlace encoded data inserted in the user data area. The user data is not necessarily written in the picture layer, where user data can be written per picture; it may instead be written in the GOP layer, which groups pictures together. When picture user data is used, the separator detects the user_start_code of user_data() in the MPEG2 video layer syntax and then, after the user_start_code, detects the unique code of about 4 bytes (for example, 0x22220204) indicating that the data belongs to this scheme. The information stored after this unique code is judged to be second interlace encoded data and can be extracted.
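The detection steps above can be sketched as a scan over the video elementary stream bytes. The function name is hypothetical, 0x000001B2 is the MPEG-2 user data start code, and the sketch assumes the embedded data itself contains no 0x000001 sequence, so the next start code prefix marks the end of each user data unit.

```python
USER_DATA_START_CODE = bytes.fromhex("000001B2")
UNIQUE_CODE = bytes.fromhex("22220204")  # example value from the text
START_CODE_PREFIX = bytes.fromhex("000001")


def extract_second_stream(video_bytes):
    """Return the payloads of all user_data() units tagged with the unique
    code, in stream order. User data written by other applications is
    skipped."""
    chunks = []
    pos = 0
    while True:
        start = video_bytes.find(USER_DATA_START_CODE, pos)
        if start < 0:
            break
        body = start + len(USER_DATA_START_CODE)
        # User data runs until the next start code prefix (or end of input).
        end = video_bytes.find(START_CODE_PREFIX, body)
        if end < 0:
            end = len(video_bytes)
        if video_bytes[body:body + len(UNIQUE_CODE)] == UNIQUE_CODE:
            chunks.append(video_bytes[body + len(UNIQUE_CODE):end])
        pos = end
    return chunks
```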

第１インターレース動画復号化器２０８は、ＭＰＥＧなどのインターレース符号化されたデータを復号化する手段である。前述したように、動画像符号化装置の第１インターレース動画符号化器１０７が標準規格でインターレース符号化を実行しているため、第１インターレース動画復号化器２０８は、通常のＭＰＥＧビデオなどに相当するインターレース対応の標準規格の復号方式によって第１インターレース動画符号化データを復号化することができる。本実施の形態では互換性を保証する意味で、標準規格の符号化方式を採用することが重要である。 The first interlaced video decoder 208 is means for decoding interlace encoded data such as MPEG. As described above, since the first interlace moving picture encoder 107 of the moving picture coding apparatus performs the interlace coding according to the standard, the first interlace moving picture decoder 208 corresponds to a normal MPEG video or the like. The first interlaced video encoded data can be decoded by a standard decoding method that supports interlace. In this embodiment, it is important to adopt a standard encoding method in order to guarantee compatibility.

第２インターレース動画復号化器２０７は、第１インターレース動画復号化器２０８によって復号された第１インターレース復号化画像を参照画像として使用して、第２インターレース符号化データを復号化する。 The second interlace video decoder 207 decodes the second interlace encoded data using the first interlace decoded image decoded by the first interlace video decoder 208 as a reference image.

As described with reference to FIG. 3(b), the second interlaced image B has been predictively encoded using the first interlace decoded image as an additional reference image. In general, an M picture of the second interlaced image B at time t has been predicted from the three nearest reference images: the first interlace decoded picture I' (or P') at time (t−1), the M picture at time (t−1), and the first interlace decoded picture I' (or P') at time t. To decode the M picture of the second interlaced image B at time t, the second interlaced video decoder 207 receives the first interlace decoded images at time (t−1) and time t from the first interlaced video decoder 208. It then decodes the M picture at time t by adding the difference image to whichever of the first interlace decoded image at time (t−1), the first interlace decoded image at time t, and the M picture at time (t−1) is designated as the reference image. When bidirectional prediction is used, as in FIG. 4(b), the N picture at time t is decoded using three nearby reference images from both the past and future directions.
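Decoding one macroblock of an M picture then amounts to picking the designated reference, applying the motion vector, and adding the difference block. The sketch below assumes 8-bit samples, integer-pel vectors, and a hypothetical mapping from the 2-bit identifier to the three references.

```python
def decode_macroblock(references, ref_id, bx, by, mv, residual):
    """Reconstruct one block of an M picture.

    `references` maps a reference identifier to a decoded field (a list of
    rows); in this sketch 0 is A' at t-1, 1 is the M picture at t-1, and
    2 is A' at t. `mv` is (dx, dy) in whole pixels, and `residual` is the
    decoded difference block."""
    ref = references[ref_id]
    dx, dy = mv
    block = []
    for y, row in enumerate(residual):
        out_row = []
        for x, diff in enumerate(row):
            value = ref[by + y + dy][bx + x + dx] + diff
            out_row.append(max(0, min(255, value)))  # clip to the 8-bit range
        block.append(out_row)
    return block
```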

第１インターレース動画復号化器２０８によって復号された第１インターレース復号化画像Ａ’は、Ｄ／Ａ変換器２１１およびプログレッシブ画像構成器２０９に供給される。Ｄ／Ａ変換器２１１によって第１インターレース復号化画像Ａ’はデジタルからアナログに変換され、画像出力器２１２にて画像出力される。 The first interlace decoded image A ′ decoded by the first interlace video decoder 208 is supplied to the D / A converter 211 and the progressive image composer 209. The first interlace decoded image A ′ is converted from digital to analog by the D / A converter 211, and the image output unit 212 outputs the image.

第２インターレース動画復号化器２０７によって復号された第２インターレース復号化画像Ｂ’はプログレッシブ画像構成器２０９に供給される。プログレッシブ画像構成器２０９は、第１インターレース復号化画像Ａ’と第２インターレース復号化画像Ｂ’を合成することにより、プログレッシブ復号化画像Ｃ’を生成し、Ｄ／Ａ変換器２１０に供給する。Ｄ／Ａ変換器２１０によってプログレッシブ復号化画像Ｃ’はデジタルからアナログに変換され、画像出力器２１２にて画像出力される。 The second interlace decoded image B ′ decoded by the second interlace video decoder 207 is supplied to the progressive image composer 209. The progressive image composer 209 generates a progressive decoded image C ′ by synthesizing the first interlace decoded image A ′ and the second interlace decoded image B ′ and supplies it to the D / A converter 210. The progressive decoded image C ′ is converted from digital to analog by the D / A converter 210, and the image output unit 212 outputs the image.
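The composition step can be sketched as line interleaving of the two complementary fields that share the same time instant. The sketch assumes the odd field carries lines 1, 3, 5, ... counted from the top of the frame; the function name and the `a_is_odd` flag are hypothetical.

```python
def compose_progressive(a_field, b_field, a_is_odd):
    """Interleave two complementary fields into one progressive frame.

    `a_field` and `b_field` are lists of rows from streams A' and B' at the
    same time instant. `a_is_odd` says whether the stream A field is the
    odd field, assumed here to hold the top line of the frame."""
    if len(a_field) != len(b_field):
        raise ValueError("fields must have the same number of lines")
    top, bottom = (a_field, b_field) if a_is_odd else (b_field, a_field)
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(list(top_line))
        frame.append(list(bottom_line))
    return frame
```

Since the two streams have opposite phase, `a_is_odd` alternates from one time unit to the next.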

ここでＤ／Ａ変換器２１０、２１１は画像出力器２１２とは別に記載しているが、画像出力器２１２の中に搭載されていることもある。また、ＤＴＶなどの機器のように画像符号化されたデジタル信号がそのまま入力される場合もある。 Here, although the D / A converters 210 and 211 are described separately from the image output unit 212, they may be mounted in the image output unit 212. In some cases, an image-encoded digital signal is input as it is like a device such as a DTV.

Meanwhile, the audio data output from the demultiplexer 203 is input to the audio decoder 204. The audio decoder 204 decodes audio data encoded by MPEG audio encoding, Dolby AC3, or the like. Since these encoding methods are standardized, the description of decoding is omitted. The decoded audio data is converted from digital to analog by the D/A converter 205 and output as sound by an audio output unit 206 such as a speaker.

If the image output unit 212 outputs only the first interlaced image from the first interlaced video decoder 208, an interlaced image is reproduced. If instead the first interlaced image output from the first interlaced video decoder 208 and the second interlaced image output from the second interlaced video decoder 207 are combined by the progressive image composer 209 and output in progressive form by the image output unit 212, a progressive image is reproduced.

Embodiment 2

FIG. 7 is a configuration diagram of the video encoding apparatus according to Embodiment 2. Embodiment 2 differs from Embodiment 1 in that the second interlaced video encoder 109 of Embodiment 1 is replaced with a progressive video encoder 119, and in that the image supplied from the A/D converter 103 to the progressive video encoder 119 is the progressive image C of FIG. 2(c). Everything else is the same as in Embodiment 1, so the common description is omitted and only the differing configuration and operation are described.

プログレッシブ動画符号化器１１９は、動画局部復号器１０８から出力される第１インターレース復号化画像Ａ’と、Ａ／Ｄ変換器１０３から出力されるプログレッシブ画像Ｃを入力として受け取り、プログレッシブ画像Ｃを符号化してプログレッシブ符号化データを生成し、多重化器１１０に与える。 The progressive video encoder 119 receives the first interlace decoded image A ′ output from the video local decoder 108 and the progressive image C output from the A / D converter 103 as inputs, and encodes the progressive image C. Progressively encoded data is generated and provided to the multiplexer 110.

図８は、プログレッシブ動画符号化器１１９によるプログレッシブ画像Ｃの片方向予測符号化を説明する図である。 FIG. 8 is a diagram for explaining unidirectional predictive encoding of a progressive image C by the progressive video encoder 119.

図８（ａ）は、比較のため、第１インターレース動画符号化器１０７によるインターレース画像Ａの片方向予測符号化を示したものであり、これは、実施の形態１と同じである。 For comparison, FIG. 8A shows the one-way predictive encoding of the interlaced image A by the first interlaced video encoder 107, which is the same as in the first embodiment.

図８（ｂ）は、プログレッシブ動画符号化器１１９によるプログレッシブ画像Ｃの片方向予測符号化を示す。図８（ｂ）には説明の便宜上、二種類のストリームデータが図示されている。上段のストリームデータでは、第２インターレース画像ＢのＭピクチャが、過去のＭピクチャの他、第１インターレース復号化画像Ａ’のフィールド画像Ｉ’、Ｐ’も参照画像として利用して予測符号化される様子が示されており、ここまでは実施の形態１と同じである。 FIG. 8B shows unidirectional predictive encoding of the progressive image C by the progressive video encoder 119. FIG. 8B shows two types of stream data for convenience of explanation. In the upper stream data, the M picture of the second interlaced image B is predictively encoded using the past M pictures as well as the field images I ′ and P ′ of the first interlaced decoded image A ′ as reference images. This is the same as in the first embodiment.

実施の形態２では、プログレッシブ動画符号化器１１９にはプログレッシブ画像Ｃが与えられることから、元の第１インターレース画像Ａもさらなる参照画像として利用できる。図８（ｂ）の下段のストリームデータは、第２インターレース画像ＢのＭピクチャが、元の第１インターレース画像Ａのフィールド画像Ｉ、Ｐも参照画像として利用して予測符号化されることを示している。図８（ｂ）の上段と下段を合わせると、第２インターレース画像ＢのＭピクチャは、合計５枚の参照画像を用いて予測符号化することができる。 In the second embodiment, since the progressive video encoder 119 is provided with the progressive image C, the original first interlaced image A can also be used as a further reference image. The lower stream data in FIG. 8B indicates that the M picture of the second interlaced image B is predictively encoded using the field images I and P of the original first interlaced image A as reference images. ing. When the upper stage and the lower stage of FIG. 8B are combined, the M picture of the second interlaced image B can be predictively encoded using a total of five reference images.

このように、プログレッシブ動画符号化器１１９は、符号化対象となっている第２インターレース画像を時系列的な基準として、過去のＭピクチャ以外に、動画局部復号器１０８によって生成された第１インターレース復号化画像Ａ’と、Ａ／Ｄ変換器１０３から供給されるプログレッシブ画像Ｃに含まれる第１インターレース画像Ａを参照画像として用い、第２インターレース画像ＢのＭピクチャを予測し、プログレッシブ画像Ｃを符号化する。プログレッシブ動画符号化器１１９は、図８（ｂ）の符号化されたＩピクチャ、Ｐピクチャ、Ｍピクチャを、プログレッシブの状態にして出力する。たとえば、相補的な関係にある奇数フィールドのＰピクチャと偶数フィールドのＭピクチャを合成してプログレッシブ画像にして出力する。 As described above, the progressive video encoder 119 uses the second interlaced image to be encoded as a time-series reference, and uses the first interlace generated by the video local decoder 108 in addition to the past M pictures. Using the decoded image A ′ and the first interlaced image A included in the progressive image C supplied from the A / D converter 103 as a reference image, the M picture of the second interlaced image B is predicted, and the progressive image C is Encode. The progressive video encoder 119 outputs the encoded I picture, P picture, and M picture shown in FIG. 8B in a progressive state. For example, a P picture of an odd field and an M picture of an even field, which are in a complementary relationship, are synthesized and output as a progressive image.

FIG. 9 is a diagram explaining bidirectional predictive encoding of the progressive image C by the progressive video encoder 119.

For comparison, FIG. 9A shows bidirectional predictive encoding of the interlaced image A by the first interlaced video encoder 107; this is the same as in the first embodiment.

FIG. 9B shows bidirectional predictive encoding of the progressive image C by the progressive video encoder 119. For convenience of explanation, FIG. 9B depicts two kinds of stream data. The upper stream shows an N picture of the second interlaced image B being predictively encoded using the past, current, and future field images of the first interlaced decoded image A′ as reference images. Up to this point the scheme is the same as in the first embodiment.

In the second embodiment, the progressive video encoder 119 is supplied with the progressive image C, so the original first interlaced image A can also be used as a further reference image. The lower stream in FIG. 9B shows the N picture of the second interlaced image B being predictively encoded using the past, current, and future field images of the original first interlaced image A as reference images. Taking the upper and lower streams of FIG. 9B together, the N picture of the second interlaced image B can be predictively encoded using a total of six reference images.
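The reference counts stated for the two embodiments (three without the progressive image, five for unidirectional and six for bidirectional prediction with it) can be checked with a short enumeration. This is only a sketch: the function name, the `(source, time)` tuples, and the labels "A'", "A", and "B" are hypothetical notation introduced here, and one time step is assumed to equal one picture period.

```python
def reference_candidates(t, bidirectional, has_progressive):
    """List (source, time) pairs usable as references for the picture of
    the second interlaced image at time t.

    "A'" = first interlaced decoded image
    "A"  = original first interlaced image contained in progressive image C
    "B"  = earlier picture of the second interlaced image itself
    """
    if bidirectional:
        times = [t - 1, t, t + 1]            # past, current, future
        refs = [("A'", u) for u in times]
        if has_progressive:                  # second embodiment only
            refs += [("A", u) for u in times]
    else:
        refs = [("A'", t - 1), ("B", t - 1), ("A'", t)]
        if has_progressive:                  # second embodiment only
            refs += [("A", t - 1), ("A", t)]
    return refs


n_refs = reference_candidates(t=5, bidirectional=True, has_progressive=True)
m_refs = reference_candidates(t=5, bidirectional=False, has_progressive=True)
```

With these assumptions, `n_refs` contains the six candidates for an N picture and `m_refs` the five candidates for an M picture.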

In this way, the progressive video encoder 119 uses as reference images the first interlaced decoded image A′ generated by the video local decoder 108 and the first interlaced image A contained in the progressive image C supplied from the A/D converter 103. With these it predicts the N picture of the second interlaced image B and encodes the progressive image C. The progressive video encoder 119 outputs the encoded I, P, B, and N pictures of FIG. 9B in progressive form. For example, it combines a B picture of an odd field with the complementary N picture of an even field and outputs the result as a progressive image.

FIG. 10 is a configuration diagram of a moving image decoding apparatus according to the second embodiment. The second interlaced video decoder 207 of the first embodiment is replaced in the second embodiment by a progressive video decoder 217, and the data supplied from the demultiplexer 203 to the progressive video decoder 217 is progressive encoded data obtained by encoding the progressive image C of FIG. 2C. The progressive image composer 209 of the first embodiment is unnecessary in the second embodiment, and the output of the progressive video decoder 217 is supplied directly to the D/A converter 210. Everything else is the same as in the first embodiment, so the common description is omitted and only the differing configuration and operation are described.

The progressive video decoder 217 receives the first interlaced decoded image decoded by the first interlaced video decoder 208, decodes the progressive encoded data, and outputs a progressive image.

As described with reference to FIG. 8B, the second interlaced image B has been predictively encoded using, as reference images, the first interlaced decoded image and the first interlaced image contained in the progressive image. In general, the M picture of the second interlaced image B at time t is predicted from the five nearest reference images: the first interlaced decoded picture I′ (or P′) at time (t−1), the M picture at time (t−1), the first interlaced decoded picture I′ (or P′) at time t, the first interlaced picture I (or P) contained in the progressive image at time (t−1), and the first interlaced picture I (or P) contained in the progressive image at time t. To decode the M picture of the second interlaced image B at time t, the progressive video decoder 217 receives the first interlaced decoded pictures I′ (or P′) at times (t−1) and t from the first interlaced video decoder 208, and obtains the first interlaced picture I (or P) by decoding the first interlaced encoded data contained in the progressive encoded data supplied from the demultiplexer 203. It then decodes the M picture at time t by adding the difference image to whichever of these pictures is designated as the reference image. The progressive video decoder 217 outputs the decoded I, P, and M pictures in progressive form. For example, it combines a P picture of an odd field with the complementary M picture of an even field and outputs the result as a progressive image.

When bidirectional prediction is used as in FIG. 9B, the N picture at time t is decoded using six neighboring reference images drawn from both the past and the future. The progressive video decoder 217 outputs the decoded I, P, B, and N pictures in progressive form. For example, it combines a B picture of an odd field with the complementary N picture of an even field and outputs the result as a progressive image.
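The decoding step of adding a difference image to the designated reference image can be sketched as follows. This is a deliberately simplified illustration: motion compensation, transform coding, and quantization are omitted, and the function names, the dictionary of references, and the `(source, time)` identifiers are all assumptions introduced for the example rather than details given in the specification.

```python
def encode_residual(target, references, ref_id):
    """Return (ref_id, difference image) for one picture.

    ref_id plays the role of the identifier that tells the decoder
    which reference image was used.
    """
    ref = references[ref_id]
    diff = [[t - r for t, r in zip(t_row, r_row)]
            for t_row, r_row in zip(target, ref)]
    return ref_id, diff


def decode_picture(ref_id, diff, references):
    """Rebuild a picture by adding the difference image to the
    reference image named by ref_id."""
    ref = references[ref_id]
    return [[r + d for r, d in zip(r_row, d_row)]
            for r_row, d_row in zip(ref, diff)]


# Toy references available when decoding the M picture at time t = 1.
references = {
    ("A'", 0): [[100, 102], [104, 106]],
    ("A'", 1): [[101, 103], [105, 107]],
    ("B", 0): [[99, 101], [103, 105]],
}
m_picture = [[102, 103], [106, 108]]

ref_id, diff = encode_residual(m_picture, references, ("A'", 1))
decoded = decode_picture(ref_id, diff, references)
```

Because the sketch is lossless, `decoded` equals `m_picture`; in an actual codec the difference image would be quantized and the reconstruction would be approximate.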

The moving image encoding apparatus may be provided with a second video encoder that combines the function of the second interlaced video encoder 109 of the first embodiment with the function of the progressive video encoder 119 of the second embodiment. When the second interlaced image B is input from the A/D converter 103, the second video encoder performs the function of the second interlaced video encoder 109; when the progressive image C is input from the A/D converter 103, the second video encoder performs the function of the progressive video encoder 119.

Similarly, the moving image decoding apparatus may be provided with a second video decoder that combines the function of the second interlaced video decoder 207 of the first embodiment with the function of the progressive video decoder 217 of the second embodiment. When the encoded second interlaced image B is input from the demultiplexer 203, the second video decoder performs the function of the second interlaced video decoder 207; when the encoded progressive image C is input from the demultiplexer 203, the second video decoder performs the function of the progressive video decoder 217.
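The switching behavior of such a combined second encoder or decoder can be expressed as a simple dispatch on the kind of input. The sketch below is illustrative only: the enum, the lookup tables, and the function name are hypothetical, and how the input kind is detected or signaled is not addressed here. The returned numbers are the reference numerals used in the two embodiments.

```python
from enum import Enum


class SecondInput(Enum):
    """Kind of signal handed to the combined second encoder/decoder."""
    INTERLACED_B = "second interlaced image B"
    PROGRESSIVE_C = "progressive image C"


# Reference numerals of the units whose function is performed.
ENCODER_FUNCTION = {SecondInput.INTERLACED_B: 109,
                    SecondInput.PROGRESSIVE_C: 119}
DECODER_FUNCTION = {SecondInput.INTERLACED_B: 207,
                    SecondInput.PROGRESSIVE_C: 217}


def select_function(kind, decoding):
    """Return the reference numeral of the function to perform."""
    table = DECODER_FUNCTION if decoding else ENCODER_FUNCTION
    return table[kind]


chosen = select_function(SecondInput.PROGRESSIVE_C, decoding=True)
```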

The present invention has been described above on the basis of embodiments. The embodiments are illustrative; those skilled in the art will understand that various modifications of the combinations of their constituent elements and processing steps are possible, and that such modifications also fall within the scope of the present invention.

In the present embodiments, the type of recording medium is not particularly limited. Any recording medium may be used, for example a hard disk, an optical disc, a memory, or a tape. Data may also be sent over any transmission medium, such as a communication link or a broadcast, without being recorded on a recording medium; in that case the recording apparatus can be used as a transmitting apparatus and the playback apparatus as a receiving apparatus. The term "medium" covers not only media in the narrow sense of media on which data can be recorded, but also electromagnetic waves, light, and the like used to transmit signal data. The information recorded on a recording medium is taken to include the data itself, such as an electronic file, in its unrecorded state.

FIG. 1 is a configuration diagram of a moving image encoding apparatus according to the first embodiment.
FIG. 2 is a diagram explaining the relationship between two complementary interlaced images and a progressive image.
FIG. 3 is a diagram explaining unidirectional predictive encoding by the first interlaced video encoder and the second interlaced video encoder of FIG. 1.
FIG. 4 is a diagram explaining bidirectional predictive encoding by the first interlaced video encoder and the second interlaced video encoder of FIG. 1.
FIG. 5 is a diagram explaining how the multiplexer/picture user data generator of FIG. 1 merges the first interlaced encoded data and the second interlaced encoded data.
FIG. 6 is a configuration diagram of a moving image decoding apparatus according to the first embodiment.
FIG. 7 is a configuration diagram of a moving image encoding apparatus according to the second embodiment.
FIG. 8 is a diagram explaining unidirectional predictive encoding of a progressive image by the progressive video encoder of FIG. 7.
FIG. 9 is a diagram explaining bidirectional predictive encoding of a progressive image by the progressive video encoder of FIG. 7.
FIG. 10 is a configuration diagram of a moving image decoding apparatus according to the second embodiment.
FIG. 11 is a configuration diagram of a conventional MPEG encoding apparatus.
FIG. 12 is a configuration diagram of a conventional MPEG decoding apparatus.

Explanation of symbols

106 audio encoder, 107 first interlaced video encoder, 108 video local decoder, 109 second interlaced video encoder, 110 multiplexer, 119 progressive video encoder, 203 demultiplexer, 204 audio decoder, 207 second interlaced video decoder, 208 first interlaced video decoder, 209 progressive image composer, 217 progressive video decoder.

Claims

A moving image encoding apparatus comprising:
an extraction unit that extracts, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and a second interlaced image in which odd fields and even fields are extracted in a phase opposite to that of the first interlaced image;
a first encoding unit that encodes the first interlaced image to generate first interlaced encoded data; and
a second encoding unit that encodes the second interlaced image to generate second interlaced encoded data, using at least a first interlaced decoded image obtained by decoding the first interlaced encoded data as a reference image for predictive encoding.

The moving image encoding apparatus according to claim 1, wherein, when predictively encoding the second interlaced image currently being encoded, the second encoding unit takes that second interlaced image as the temporal reference and uses three kinds of images, namely a past first interlaced decoded image, a past second interlaced image, and a current first interlaced decoded image, as reference images for unidirectional predictive encoding, and includes identifiers identifying the three kinds of reference images in the second interlaced encoded data.

The moving image encoding apparatus according to claim 1, wherein, when predictively encoding the second interlaced image currently being encoded, the second encoding unit takes that second interlaced image as the temporal reference and uses three kinds of images, namely a past first interlaced decoded image, a current first interlaced decoded image, and a future first interlaced decoded image, as reference images for bidirectional predictive encoding, and includes identifiers identifying the three kinds of reference images in the second interlaced encoded data.

A moving image encoding apparatus comprising:
an extraction unit that extracts, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and a progressive image including both odd-numbered and even-numbered scanning lines;
a first encoding unit that encodes the first interlaced image to generate first interlaced encoded data; and
a progressive encoding unit that, using at least a first interlaced decoded image obtained by decoding the first interlaced encoded data and the first interlaced image included in the progressive image supplied from the extraction unit as reference images for predictive encoding, encodes a second interlaced image in which odd fields and even fields are in a phase opposite to that of the first interlaced image to generate second interlaced encoded data, and generates progressive encoded data by combining first interlaced encoded data, generated by encoding the first interlaced image included in the progressive image, with the generated second interlaced encoded data.

The moving image encoding apparatus according to claim 4, wherein, when predictively encoding the second interlaced image currently being encoded, the progressive encoding unit takes that second interlaced image as the temporal reference and uses five kinds of images, namely a past first interlaced decoded image, a past second interlaced image, a past progressive image, a current first interlaced decoded image, and a current progressive image, as reference images for unidirectional predictive encoding, and includes identifiers identifying the five kinds of reference images in the progressive encoded data.

The moving image encoding apparatus according to claim 4, wherein, when predictively encoding the second interlaced image currently being encoded, the progressive encoding unit takes that second interlaced image as the temporal reference and uses six kinds of images, namely a past first interlaced decoded image, a past progressive image, a current first interlaced decoded image, a current progressive image, a future first interlaced decoded image, and a future progressive image, as reference images for bidirectional predictive encoding, and includes identifiers identifying the six kinds of reference images in the progressive encoded data.

A moving image decoding apparatus comprising:
a separation unit that separates and extracts, from encoded moving image data, first interlaced encoded data obtained by encoding a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and second interlaced encoded data obtained by encoding a second interlaced image in which odd fields and even fields are in a phase opposite to that of the first interlaced image;
a first decoding unit that decodes the first interlaced encoded data to generate a first interlaced decoded image;
a second decoding unit that decodes the second interlaced encoded data by prediction using at least the first interlaced decoded image as a reference image, to generate a second interlaced decoded image; and
a progressive image generation unit that generates a progressive image by combining the first interlaced decoded image and the second interlaced decoded image.

A moving image decoding apparatus comprising:
a separation unit that separates and extracts, from encoded moving image data, first interlaced encoded data obtained by encoding a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and progressive encoded data obtained by encoding a progressive image including both odd-numbered and even-numbered scanning lines;
a first decoding unit that decodes the first interlaced encoded data to generate a first interlaced decoded image; and
a progressive decoding unit that, using as reference images at least the first interlaced decoded image decoded by the first decoding unit and a first interlaced decoded image obtained by decoding first interlaced encoded data included in the progressive encoded data, decodes second interlaced encoded data, obtained by encoding a second interlaced image in which odd fields and even fields are in a phase opposite to that of the first interlaced image, to generate a second interlaced decoded image, and generates a progressive decoded image by combining the first interlaced decoded image obtained by decoding the first interlaced encoded data included in the progressive encoded data with the generated second interlaced decoded image.

A moving image encoding method comprising the steps of:
extracting, from a moving image, a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and a second interlaced image in which odd fields and even fields are extracted in a phase opposite to that of the first interlaced image;
encoding the first interlaced image to generate first interlaced encoded data; and
encoding the second interlaced image to generate second interlaced encoded data, using at least a first interlaced decoded image obtained by decoding the first interlaced encoded data as a reference image for predictive encoding.

A moving image decoding method comprising the steps of:
separating and extracting, from encoded moving image data, first interlaced encoded data obtained by encoding a first interlaced image in which odd fields consisting of odd-numbered scanning lines and even fields consisting of even-numbered scanning lines are extracted alternately at a predetermined frequency, and second interlaced encoded data obtained by encoding a second interlaced image in which odd fields and even fields are in a phase opposite to that of the first interlaced image;
decoding the first interlaced encoded data to generate a first interlaced decoded image;
decoding the second interlaced encoded data by prediction using at least the first interlaced decoded image as a reference image, to generate a second interlaced decoded image; and
generating a progressive image by combining the first interlaced decoded image and the second interlaced decoded image.