JP2004023288A

JP2004023288A - Preprocessing system for moving image encoding

Info

Publication number: JP2004023288A
Application number: JP2002173141A
Authority: JP
Inventors: Hitoshi Naito; 内藤　整; Koichi Takagi; 高木　幸一; Masahiro Wada; 和田　正裕; Shuichi Matsumoto; 松本　修一; Koichi Ishihara; 石原　剛一
Original assignee: KDDI R&D Laboratories Inc
Current assignee: KDDI Research Inc
Priority date: 2002-06-13
Filing date: 2002-06-13
Publication date: 2004-01-22

Abstract

PROBLEM TO BE SOLVED: To obtain visual priority information that describes human visual characteristics with respect to a moving image with high accuracy.

SOLUTION: Objects in the screen of an input image are extracted (S1), and a gaze degree parameter V(j) is calculated for each extracted object (S2). A texture attribute parameter t(k) is also determined for each macroblock contained in the object (S3). A visual priority parameter w(k) for each macroblock is calculated from the parameters V(j) and t(k). This parameter w(k) is output to an adaptive quantizer and used to determine the quantization parameter.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、動画像符号化のための前処理方式に関し、特に、動画像に対する人間の視覚特性を高精度に表す視覚優先度情報を得ることができる動画像符号化のための前処理方式に関するものである。
【０００２】
【従来の技術】
動画像符号化の国際標準であるＭＰＥＧ−２の符号化検証モデルＴｅｓｔ　Ｍｏｄｅｌ５［１］においては、図５にその概念を示すよう、マクロブロックごとの発生ビット数を入力し、目標符号化ビット数を出力する仮想バッファを導入し、発生ビット数の目標符号化ビット数に対する過不足をこの仮想バッファに蓄積し、量子化制御にフィードバックする。
【０００３】
ここではマクロブロック（１６×１６画素）ごとの重み係数、すなわち視覚感度ｗ（ｋ）を下記（１）式により算出し、量子化パラメータを、視覚感度ｗ（ｋ）と仮想バッファの占有量ｄとを用いて下記（２）式により算出する。なお、ピクチャの目標ビット数をＴ、ピクチャ内マクロブロック数をＭＢｃｎｔとすると、１マクロブロックあたりの目標符号化ビット数は、Ｔ／ＭＢｃｎｔで表される。
【０００４】

量子化パラメータ＝占有量ｄ×視覚感度ｗ（ｋ）　　　　　　　　・・・（２）
【０００５】
【発明が解決しようとする課題】
しかしながら、前記従来技術においては、（１）式から明らかなように、基本的に視覚感度をマクロブロックの輝度分散のみで決定しており、マクロブロックごとの重み付けが動画像に対する人間の視覚感度に対応した最適なものには必ずしもなっていないという問題がある。
【０００６】
また、マクロブロックを構成する８×８画素のブロックの輝度分散ａ_ｘ（ｋ）のうちの最も小さい輝度分散ａｃｔ（ｋ）を用いて当該マクロブロックの視覚感度ｗ（ｋ）を算出しているため、精細領域中に平坦ブロックが単独で存在する場合に、算出した視覚感度は、平坦ブロックを含むマクロブロックとそれを含まないマクロブロックとで大きく異なり、性質がほぼ等しいと考えられる精細領域どうしの視覚感度が大きく変動するという問題がある。また、これにより算出した量子化パラメータを用いて量子化制御を行うと、単独で存在する平坦ブロックを含むマクロブロックは、ノイズがあまり目立たないにもかかわらず量子化が精細に行われ、それに対して多くのビット数が配分されるため、最適なビット配分がなされているとは言えない。
【０００７】
本発明の目的は、動画像に対する人間の視覚特性を高精度に表す視覚優先度情報を得ることができる動画像符号化のための前処理方式を提供することを目的とするものであり、これにより得られる視覚優先度情報を用いて動画像符号化における量子化制御を行えば最適なビット配分で画面全体の主観画質の大幅な向上が可能となる。
【０００８】
【課題を解決するための手段】
前記した課題を解決するために、本発明は、入力される画面単位に解析を行い、マクロブロックごとの視覚優先度情報を算出する視覚優先度情報算出手段を備え、前記視覚優先度情報算出手段により算出した視覚優先度情報を符号化部へ出力する点に第１の特徴がある。
【０００９】
また、本発明は、前記視覚優先度情報算出手段が、オブジェクトごとの注視度とマクロブロックごとのテクスチャ属性を考慮して視覚優先度情報を算出する点に第２の特徴がある。
【００１０】
また、本発明は、前記オブジェクトが、同一の属性を有するマクロブロックのグループ化に基づいて抽出されたものである点に第３の特徴がある。
【００１１】
また、本発明は、前記マクロブロックのグループ化のための分類基準値が、対応する輝度成分、色差成分、動き成分のうちの少なくとも一つの指標を用いて定義されたものである点に第４の特徴がある。
【００１２】
また、本発明は、前記オブジェクトごとの注視度を、該オブジェクトに含まれるマクロブロックの動き量の平均、マクロブロックごとに動きが散乱している度合、該オブジェクトがピクチャ内で目立つ度合いのうちの少なくとも一つの指標を用いて定義する点に第５の特徴がある。
【００１３】
さらに、本発明は、前記マクロブロックごとのテクスチャ属性を、該マクロブロックを構成する輝度ブロックの分散値を考慮して決める点に第６の特徴がある。
【００１４】
第１の特徴によれば、解析を画面単位に行ってマクロブロックごとの視覚優先度情報を算出しているため、マクロブロックの輝度分散のみを基準にして視覚感度を算出するものに比べて人間の視覚特性により適合した情報を得ることができる。
【００１５】
また、第２、第５および第６の特徴によれば、オブジェクトごとの注視度とマクロブロックごとのテクスチャ属性により動画像に対する人間の視覚感度が高精度に反映された視覚優先度情報を算出することができる。
【００１６】
また、第３および第４の特徴によれば、オブジェクトを適切に抽出することができ、オブジェクトごとの注視度の算出に供することができる。
【００１７】
【発明の実施の形態】
以下、図面を参照して本発明を詳細に説明する。図１は、本発明に係る前処理部１と動画像符号化部２を備えた動画像符号化装置の一例のブロック構成図である。同図において、入力画像（動画像）は、動き補償（ＭＣ）フレーム間予測及びＤＣＴ符号化器３に入力され、動き補償フレーム間予測によって得られた予測誤差がＤＣＴ係数に変換される。このＤＣＴ係数は、適応量子化器４で量子化レベルに置き換えられ、さらに可変長符号化部５で可変長符号化されて符号化データとなり、動きベクトル情報とともに多重化された後、バッファメモリに蓄えられる。
【００１８】
前処理装置１は、詳細は後述するが、入力画像に基づいて該入力画像に対する人間の視覚感度を推定し、例えば１６×１６画素のマクロブロックごとの視覚優先度情報を算出し、この視覚優先度情報を適応量子化器４に出力する。
【００１９】
適応量子化器４は、前処理部１から入力される視覚優先度情報に応じて量子化パラメータを算出し、この量子化パラメータに基づいて一画面内における各マクロブロックのビット配分を決定する。
【００２０】
図２は、前処理部１における処理の一実施形態のフロー図である。以下の説明では、各マクロブロックをｋで識別し、画面内に存在する各オブジェクトをｊで識別する。
【００２１】
まず、入力画像から画面単位でオブジェクトを抽出する（Ｓ１）。このオブジェクトの抽出は、マクロブロックをその属性に基づいてグループ化することにより行うことができ、グループ化のための基準とする属性（分類基準値）は、マクロブロックごとの輝度成分、色差成分、動き量のうちの少なくとも一つの指標を用いて定義することができる。この処理により、画面内の、例えば人物、車両、背景、あるいはそれらが画像の特徴によりさらに区分された部分ごとにマクロブロックがグループ化される。以下では、あるオブジェクトを構成するマクロブロック、すなわちグループ化されたマクロブロックを総称しＭＢグループと呼ぶ。
【００２２】
図３は、動き量と２つの色差成分Ｐｂ、Ｐｒ（０〜２５５レベル）とからなる３次元座標を分類基準値とする例を示し、一画面内の各マクロブロックについて動き量および色差成分Ｐｂ、Ｐｒを求めてこの３次元座標上にプロットし、３次元座標位置の近いマクロブロックを同一グループとしてグループ化する。例えば、動き量を２段階に分け、色差成分Ｐｂ、Ｐｒをそれぞれ５段階に分ければ、マクロブロックを最大５０種類の属性にグループ化することができる。
【００２３】
なお、色差成分Ｐｂ、Ｐｒは、マクロブロックにおける各色差成分の平均値とすればよく、マクロブロックの動き量は、まず、再生順における直前のフレームを参照してマクロブロックの動きベクトルＶｆ（ｋ）を求め、次に、この動きベクトルＶｆ（ｋ）に対してパンやズームなどのカメラ操作に起因する成分を除くグローバル動き補正を行ってオブジェクトに特化した動きのみを示す補正ベクトルＶｇ（ｋ）を求め、その絶対値｜Ｖｇ（ｋ）｜を動き量とすることにより求めることができる。
【００２４】
次に、Ｓ１で抽出した各オブジェクトについてその注視度パラメータＶ（ｊ）を算出する（Ｓ２）。オブジェクトごとの注視度パラメータＶ（ｊ）は、該オブジェクトに含まれるマクロブロックの動き量の平均Ｌ（ｊ）、オブジェクト内でマクロブロックごとに動きが散乱している度合Ｒ（ｊ）、当該オブジェクトがピクチャ内で目立つ度合いＫ（ｊ）のうちの少なくとも一つの指標を用いて定義することができる。以下に前記３つの指標Ｌ（ｊ）、Ｒ（ｊ）、Ｋ（ｊ）を用いて注視度パラメータＶ（ｊ）を算出する例について説明する。
【００２５】
まず、オブジェクトに含まれるマクロブロックの動き量の平均Ｌ（ｊ）を、グローバル補正前の動き量Ｖｆ（ｋ）の絶対値｜Ｖｆ（ｋ）｜のＭＢグループ内平均を求めることにより算出する。
【００２６】
また、オブジェクト内でマクロブロックごとに動きが散乱している度合Ｒ（ｊ）を、同一ＭＢグループ内の全マクロブロックについて、同一ＭＢグループに含まれる隣接マクロブロック（これをｋ′で識別する。）に対するＶｇ（ｋ）の散乱度Ｒ（ｊ，ｋ）を下記（３）式に従い算出した上で、ＭＢグループ内平均を求めることにより算出する。
【００２７】

ここで、Ｖｇｘ、Ｖｇｙは、Ｖｇのそれぞれｘ軸方向成分、ｙ軸方向成分を表す。
【００２８】
また、当該オブジェクトがピクチャ内で目立つ度合いＫ（ｊ）を、当該オブジェクトの希少性Ｋａ（ｊ）と異質性Ｋｂ（ｊ）とから算出する。希少性Ｋａ（ｊ）は、前記マクロブロックの分類基準と同様に、ＭＢグループ内の平均動き量、平均色差成分により算出したＭＢグループの分類基準値に基づいて、例えば５０種類に分類し、それら分類されたものの画面内での発生頻度を表すヒストグラムを求め、発生頻度が小さいもの大きな値をとるよう定義される。
【００２９】
また、異質性Ｋｂ（ｊ）は、異なるＭＢグループと隣接するポイント（これをｐで識別する。）において、ＭＢグループ間での分類基準値の差Ｋｂ（ｊ，ｐ）を求め、この差Ｋｂ（ｊ，ｐ）のＭＢグループ内平均を求めることにより算出できる。
【００３０】
オブジェクトがピクチャ内で目立つ度合いＫ（ｊ）は、前記のようにして算出した希少性Ｋａ（ｊ）と異質性Ｋｂ（ｊ）とから下記（４）、（５）式により算出できる。ただし、関数Ｓは、引数とする関数の出力を平滑化するために導入するシグモイド関数である。

【００３１】
各オブジェクトについての注視度パラメータＶ（ｊ）は、前記のようにして算出した３つの指標Ｌ（ｊ）、Ｒ（ｊ）、Ｋ（ｊ）を用いて下記（６）式により算出できる。
Ｖ（ｊ）＝Ｓ（Ｋ（ｊ））／（Ｓ（Ｌ（ｊ））×Ｓ（Ｒ（ｊ）））・・・（６）
【００３２】
これにより算出した注視度パラメータＶ（ｊ）は、動きが小さく、動きの散乱が小さく、ピクチャ内で目立っているオブジェクトに対して大きな値となり、人間の動視力特性および注視特性に合ったものとなる。
【００３３】
図２に戻って、Ｓ３では、マクロブロック単位でテクスチャ属性パラメータを決定する。マクロブロックごとのテクスチャ属性パラメータは、マクロブロックを構成する輝度ブロックの分散値を考慮して決めることができる。
【００３４】
図４は、このテクスチャ属性パラメータの決定の原理説明図であり、まず、マクロブロック中に存在する、例えば８×８画素のブロックの輝度分散値ｌｖ（ｍ）（ここではブロックをｍで識別する。）としては、当該ブロック及びその上下左右に隣接する４ブロックの計５ブロックの輝度分散値にランク　オーダ　フィルタ（ｒａｎｋ　ｏｒｄｅｒ　ｆｉｌｔｅｒ）を適用し、最低値以外の輝度分散値、例えば２番目に小さな値を抽出したものを補正された輝度分散値ｃｌｖ（ｍ）として適用する。なお、各ブロックについての補正された輝度分散値として最小値を抽出すると、単独の孤立した平坦ブロックの輝度分散値が適用されるケースがあるため、前記のように最低値以外の輝度分散値、例えば２番目に小さな値を適用することが好ましい。
【００３５】
マクロブロック内のブロックについて、以上のようにして抽出した輝度分散のうちの最小値あるいは平均値を当該マクロブロックの補正輝度分散値Ａ（ｋ）とする。
【００３６】
また、飛び越し走査により動画像が再生されるものである場合、図４に示すように、フレームにおけるブロックの輝度分散ａ_ｘ，ｙ以外に、当該フレームを構成する第１フィールドおよび第２フィールドのブロックにおける輝度分散ｂ_ｘ，ｙにもフィルタを適用してそれぞれ、例えば２番目に小さな値ａ′_０，０、・・・を抽出し、それらを含めた輝度分散のうちの最小値あるいは平均値を当該マクロブロックの補正輝度分散値Ａ（ｋ）とすることが好ましい。
【００３７】
テクスチャ属性パラメータｔ（ｋ）は、前記のようにして算出した補正輝度分散値Ａ（ｋ）を関数Ｓに適用することにより下記（７）式で求めることができる。
ｔ（ｋ）＝Ｓ（Ａ（ｋ））　　　　　　　　　　　　　　　　　　・・・（７）
【００３８】
また、ＭＢグループどうしの境界に位置するマクロブロックでは、隣接するＭＢグループの属性が混在している可能性が高い。さらに、そのようなマクロブロックは、人間は色の変化が大きい部分あるいはエッジ部を注視する傾向がある、という部分に該当する可能性が高い。そのため、このような領域では、視覚優先度を高くすることにより主観画質の向上が期待できる。そこで、ＭＢグループの境界に位置するマクロブロックＢＭＢに対し、テクスチャ属性パラメータの補正を以下のとおり行うこととする。
【００３９】
まず、ＢＭＢおよびその上下左右に隣接する４つのマクロブロックのオブジェクト注視度パラメータＶ（ｊ_０）の最大値をＶ_ｍａｘとする。さらに、ＢＭＢおよびその上下左右に隣接する４つのマクロブロックのテクスチャ属性パラメータｔ（ｋ）の最小値をｔ_ｍｉｎとする。これら最大値および最小値を用いてＢＭＢのテクスチャ属性パラメータｔ（ｋ）を下記（８）式で求める。
ｔ（ｋ）＝（Ｖ（ｊ_０）／Ｖ_ｍａｘ）×ｔ_ｍｉｎ　　　　　　　　・・・（８）
【００４０】
次に、Ｓ４（図２）では、前記（６）、（７）あるいは（８）式で求めたＶ（ｊ）およびｔ（ｋ）を用い、下記（９）式により視覚優先度パラメータｗ（ｋ）を算出する。
ｗ（ｋ）＝ｔ（ｋ）／Ｖ（ｊ）　　　　　　　　　　　　　　　　・・・（９）この視覚優先度パラメータｗ（ｋ）は、適応量子化器４（図１）に出力され、量子化パラメータの決定に使用される。
【００４１】
以上、本発明の実施形態について説明したが、本発明は、種々に変更および修正が可能である。例えば、オブジェクトの抽出、すなわちマクロブロックのグループ化に際し、分類基準により分類されたＭＢグループの面積が小さい場合、上下左右方向で隣接する、分類基準値の近い隣接ＭＢグループを、ＭＢグループの面積が予め決められた面積を上回るまで統合して最終的なＭＢグループとすることができ、これによれば算出される視覚優先度の精度を低下させることなく処理を軽減することができる。
【００４２】
また、オブジェクトに含まれるマクロブロックの動き量の平均Ｌ（ｊ）の算出に際し、他のものから極端に異なっている動き量を算出対象から除外するようにすることにより、ノイズなどの影響をなくすことができる。
【００４３】
【発明の効果】
以上に詳細に説明したように、本発明によれば、符号化に先立って画面内のオブジェクトおよびマクロブロックの構成を高精度に解析することができ、符号化部における画面内の局所的ビット配分の最適化に供する解析データを得ることができる。これにより従来の動画像符号化で問題とされていた、狭帯域下でのＨＤＴＶ（ｈｉｇｈ　ｄｅｆｉｎｉｔｉｏｎ　ＴＶ）放送における画質劣化を解消できる。
【００４４】
放送局向け映像サービスに本発明を適用すれば、圧縮伝送用コーディックのさらなる高効率化が見込まれ、高画質のＨＤＴＶ伝送が低レートで実現可能になり、ＳＮＧ（ｓａｔｅｌｌｉｔｅ　ｎｅｗｓ　ｇａｔｈｅｒｉｎｇ）やＦＰＵ（ｆｉｌｅｄ　ｐｉｃｋ　ｕｐ）などの狭帯域下での映像サービスをより一層充実させることができる。
【００４５】
また、ＦＴＴＨ（ｆｉｂｅｒ　ｔｏ　ｔｈｅ　ｈｏｍｅ）などのブロードバンド系の映像提供サービスに本発明を適用すれば、低レート・高画質のＨＤＴＶ符号化技術の活用によりＩＰベースのＨＤＴＶ配信が可能になる。
【００４６】
なお、本発明は、高画質な映像伝送システムを実現するために、ＭＰＥＧ−２やＭｏｔｉｏｎ　ＪＰＥＧ２０００によるＨＤＴＶ／ＳＤＴＶ（ｓｔａｎｄａｒｄ　ｄｅｆｉｎｉｔｉｏｎ　ＴＶ）など動画像圧縮符号化を扱うシステム全般に適用できる。
【図面の簡単な説明】
【図１】本発明に係る前処理部と動画像符号化部を備えた動画像符号化装置の一例のブロック構成図である。
【図２】図１の前処理部における処理の一実施形態のフロー図である。
【図３】オブジェクト抽出のための分類基準の説明図である。
【図４】テクスチャ属性パラメータの決定の原理説明図である。
【図５】ＭＰＥＧ−２の符号化検証モデルＴｅｓｔ　Ｍｏｄｅｌ５［１］の概念図である。
【符号の説明】
１・・・前処理部、２・・・動画像符号化部、３・・・ＭＣ＋ＤＣＴ符号化器、４・・・適応量子化器、５・・・可変長符号化部
[0001]
[Technical field of the invention]
The present invention relates to a pre-processing method for moving image encoding, and more particularly to a pre-processing method for moving image encoding that can obtain visual priority information representing, with high accuracy, human visual characteristics with respect to a moving image.
[0002]
[Prior art]
In Test Model 5 [1], the encoding verification model for MPEG-2, the international standard for moving image encoding, a virtual buffer is introduced that takes the number of bits generated for each macroblock as input and outputs the target number of encoded bits, as shown conceptually in FIG. 5. The excess or deficiency of the generated bits relative to the target number of encoded bits is accumulated in this virtual buffer and fed back to the quantization control.
[0003]
Here, the weighting factor for each macroblock (16 × 16 pixels), that is, the visual sensitivity w(k), is calculated by the following equation (1), and the quantization parameter is calculated by the following equation (2) using the visual sensitivity w(k) and the occupancy d of the virtual buffer. If the target number of bits of a picture is T and the number of macroblocks in the picture is MBcnt, the target number of encoded bits per macroblock is T / MBcnt.
[0004]

Quantization parameter = occupancy d × visual sensitivity w(k)   ... (2)
[0005]
[Problems to be solved by the invention]
However, in the above prior art, as is apparent from equation (1), the visual sensitivity is basically determined only by the luminance variance of the macroblock, so the weighting of each macroblock is not necessarily the optimal one corresponding to human visual sensitivity to a moving image.
[0006]
In addition, the visual sensitivity w(k) of a macroblock is calculated using act(k), the smallest of the luminance variances a_x(k) of the 8 × 8 pixel blocks forming the macroblock. Consequently, when a flat block exists in isolation within a detailed region, the calculated visual sensitivity differs greatly between macroblocks that include the flat block and macroblocks that do not, and the visual sensitivity fluctuates greatly among detailed regions whose properties are considered nearly identical. Moreover, when quantization control is performed using the quantization parameter calculated in this way, a macroblock containing an isolated flat block is quantized finely and allocated many bits even though noise in it is not very noticeable, so the bit allocation cannot be said to be optimal.
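The fluctuation described above can be illustrated with a small sketch. Equation (1) is not reproduced in this text, so the sketch assumes the normalized-activity formula of the published Test Model 5 (activity = 1 + minimum block variance, normalized against the picture average); the function name and the sample variances are hypothetical, and only four blocks per macroblock are used for brevity.

```python
def tm5_normalized_activity(block_variances, avg_act):
    """Visual sensitivity of one macroblock, assuming the TM5 form of equation (1).

    block_variances: luminance variances of the 8x8 blocks in the macroblock.
    avg_act: average activity of the previously coded picture (assumed given).
    """
    act = 1.0 + min(block_variances)  # smallest block variance dominates
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


# Two macroblocks from the same detailed region; one holds an isolated flat block.
detailed = [900.0, 850.0, 920.0, 880.0]
with_flat_block = [900.0, 850.0, 920.0, 2.0]

w_detailed = tm5_normalized_activity(detailed, avg_act=400.0)
w_flat = tm5_normalized_activity(with_flat_block, avg_act=400.0)
```

Under these assumed numbers the two sensitivities differ by more than a factor of two although three of the four blocks are nearly identical, which is the behaviour the paragraph above criticizes.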
[0007]
An object of the present invention is to provide a pre-processing method for moving image encoding that can obtain visual priority information representing, with high accuracy, human visual characteristics with respect to a moving image. If quantization control in moving image encoding is performed using the visual priority information thus obtained, the subjective image quality of the entire screen can be greatly improved through optimal bit allocation.
[0008]
[Means for Solving the Problems]
In order to solve the above-described problem, the present invention has a first feature in that it includes visual priority information calculating means that performs analysis on each input screen and calculates visual priority information for each macroblock, and in that the visual priority information calculated by the visual priority information calculating means is output to the encoding unit.
[0009]
Further, the present invention has a second feature in that the visual priority information calculating means calculates the visual priority information in consideration of the degree of gaze of each object and the texture attribute of each macroblock.
[0010]
Further, the present invention has a third feature in that the object is extracted based on grouping of macroblocks having the same attribute.
[0011]
Further, the present invention has a fourth feature in that the classification reference value for grouping the macroblocks is defined using at least one index among the corresponding luminance component, color difference component, and motion component.
[0012]
Further, the present invention has a fifth feature in that the gaze degree of each object is defined using at least one index among the average motion amount of the macroblocks included in the object, the degree to which motion is scattered among the macroblocks, and the degree to which the object stands out in the picture.
[0013]
Furthermore, the present invention has a sixth feature in that the texture attribute of each macroblock is determined in consideration of the variance of the luminance block constituting the macroblock.
[0014]
According to the first feature, since the analysis is performed on each screen to calculate the visual priority information for each macroblock, information better suited to human visual characteristics can be obtained than when visual sensitivity is calculated based only on the luminance variance of the macroblock.
[0015]
According to the second, fifth, and sixth features, visual priority information that accurately reflects human visual sensitivity to a moving image can be calculated from the gaze degree of each object and the texture attribute of each macroblock.
[0016]
Further, according to the third and fourth features, an object can be appropriately extracted, and can be used for calculating a gaze degree for each object.
[0017]
[Embodiments of the invention]
Hereinafter, the present invention will be described in detail with reference to the drawings. FIG. 1 is a block diagram of an example of a moving image encoding apparatus including a preprocessing unit 1 and a moving image encoding unit 2 according to the present invention. In the figure, an input image (moving image) is input to a motion compensation (MC) inter-frame prediction and DCT encoder 3, and the prediction error obtained by motion-compensated inter-frame prediction is converted into DCT coefficients. The DCT coefficients are replaced with quantization levels by the adaptive quantizer 4, variable-length encoded by the variable-length encoding unit 5 to become encoded data, multiplexed with motion vector information, and then stored in a buffer memory.
[0018]
The preprocessing unit 1, described in detail later, estimates human visual sensitivity to the input image based on the input image, calculates visual priority information for each macroblock of, for example, 16 × 16 pixels, and outputs this visual priority information to the adaptive quantizer 4.
[0019]
The adaptive quantizer 4 calculates a quantization parameter according to the visual priority information input from the preprocessing unit 1, and determines the bit allocation of each macroblock in one screen based on the quantization parameter.
[0020]
FIG. 2 is a flowchart of one embodiment of the processing in the preprocessing unit 1. In the following description, each macroblock is identified by k, and each object existing in the screen is identified by j.
[0021]
First, objects are extracted from the input image on a per-screen basis (S1). This extraction can be performed by grouping macroblocks based on their attributes, and the attribute used as the criterion for grouping (classification reference value) can be defined using at least one index among the luminance component, color difference components, and motion amount of each macroblock. Through this processing, macroblocks are grouped into, for example, a person, a vehicle, the background, or parts of these further divided according to image characteristics. In the following, the macroblocks that constitute a given object, that is, the grouped macroblocks, are collectively called an MB group.
[0022]
FIG. 3 shows an example in which three-dimensional coordinates composed of the motion amount and the two color difference components Pb and Pr (levels 0 to 255) are used as the classification reference value. For each macroblock in one screen, the motion amount and the color difference components Pb and Pr are obtained and plotted on these three-dimensional coordinates, and macroblocks whose coordinate positions are close are grouped together. For example, if the motion amount is divided into two levels and the color difference components Pb and Pr are each divided into five levels, macroblocks can be grouped into a maximum of 2 × 5 × 5 = 50 types of attributes.
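The 50-class example can be sketched as a simple uniform binning of the three coordinates. This is only one possible realization of "grouping macroblocks whose coordinates are close"; the function names, the uniform bin edges, and the motion threshold of 2.0 are assumptions made for illustration and do not come from the text.

```python
def quantize(value, lo, hi, levels):
    """Map a value in [lo, hi) to an integer bin in 0..levels-1 (clamped)."""
    if hi <= lo or levels < 1:
        raise ValueError("invalid range or level count")
    idx = int((value - lo) / (hi - lo) * levels)
    return min(max(idx, 0), levels - 1)


def classify_macroblock(motion, pb, pr, motion_threshold=2.0):
    """Return a class index 0..49 from motion amount and mean Pb/Pr.

    motion_threshold is a hypothetical split between "still" and "moving".
    """
    m_bin = 1 if motion >= motion_threshold else 0  # 2 motion levels
    pb_bin = quantize(pb, 0, 256, 5)                # 5 levels for Pb
    pr_bin = quantize(pr, 0, 256, 5)                # 5 levels for Pr
    return (m_bin * 5 + pb_bin) * 5 + pr_bin
```

Macroblocks that receive the same class index and are spatially connected would then form one MB group; the connectivity step is omitted here.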
[0023]
Note that the color difference components Pb and Pr may be the average values of the respective color difference components in the macroblock. The motion amount of a macroblock can be obtained as follows. First, the motion vector Vf(k) of the macroblock is obtained with reference to the immediately preceding frame in reproduction order. Next, global motion correction is applied to Vf(k) to remove components caused by camera operations such as panning and zooming, yielding a corrected vector Vg(k) that indicates only the motion specific to the object. The absolute value |Vg(k)| is then used as the motion amount.
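The text does not say how the global motion is estimated. As a minimal sketch, the following assumes a pure pan and estimates it as the component-wise median of all macroblock vectors; zoom is not modelled, and the function name is hypothetical.

```python
import math
from statistics import median


def motion_amounts(vf):
    """Return |Vg(k)| for each macroblock.

    vf: list of (x, y) motion vectors Vf(k) measured against the previous frame.
    The global (camera) motion is assumed to be a pure pan, estimated as the
    component-wise median of all vectors; this is an illustrative choice.
    """
    gx = median(v[0] for v in vf)
    gy = median(v[1] for v in vf)
    # Vg(k) = Vf(k) - global motion; the motion amount is its magnitude.
    return [math.hypot(x - gx, y - gy) for x, y in vf]
```

For a frame panned by (4, 0) with one macroblock moving at (7, 4), only that macroblock gets a non-zero motion amount.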
[0024]
Next, the gaze degree parameter V(j) is calculated for each object extracted in S1 (S2). The gaze degree parameter V(j) of each object can be defined using at least one index among the average L(j) of the motion amounts of the macroblocks included in the object, the degree R(j) to which motion is scattered among the macroblocks within the object, and the degree K(j) to which the object stands out in the picture. An example in which V(j) is calculated using the three indices L(j), R(j), and K(j) is described below.
[0025]
First, the average L(j) of the motion amounts of the macroblocks included in the object is calculated as the average, over the MB group, of the absolute value |Vf(k)| of the motion vector Vf(k) before global correction.
[0026]
The degree R(j) to which motion is scattered among the macroblocks within the object is calculated as follows: for every macroblock in the same MB group, the scattering degree R(j, k) of Vg(k) relative to the adjacent macroblocks belonging to the same MB group (identified by k′) is calculated according to the following equation (3), and the average over the MB group is then taken.
[0027]

Here, Vgx and Vgy represent the x-axis direction component and the y-axis direction component of Vg, respectively.
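Equation (3) did not survive in this text, so its exact form is unknown. The sketch below is therefore only an assumed stand-in: it takes R(j, k) to be the mean absolute difference of the x and y components of Vg between macroblock k and its 4-connected neighbours k′ in the same MB group, and R(j) to be the group average. The function name and data layout are hypothetical.

```python
def scattering_degree(vg, group):
    """Assumed form of R(j); not the patent's equation (3).

    vg: dict mapping (row, col) -> (Vgx, Vgy) for each macroblock.
    group: set of (row, col) positions belonging to one MB group.
    """
    per_mb = []
    for (r, c) in group:
        vx, vy = vg[(r, c)]
        diffs = []
        for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if n in group:  # only neighbours k' inside the same MB group
                diffs.append(abs(vx - vg[n][0]) + abs(vy - vg[n][1]))
        if diffs:
            per_mb.append(sum(diffs) / len(diffs))  # R(j, k)
    return sum(per_mb) / len(per_mb) if per_mb else 0.0
```

With this assumed form, a rigidly moving object yields R(j) = 0, while an object whose macroblocks move in different directions yields a larger value.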
[0028]
Further, the degree K(j) to which the object stands out in the picture is calculated from the rarity Ka(j) and the heterogeneity Kb(j) of the object. For the rarity Ka(j), the MB groups are classified into, for example, 50 types based on the classification reference value of each MB group, which is calculated from the average motion amount and average color difference components within the MB group in the same manner as the classification criterion for macroblocks. A histogram representing the occurrence frequency of these types within the screen is then obtained, and Ka(j) is defined so as to take a larger value the lower the occurrence frequency.
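The text only requires that Ka(j) grow as the occurrence frequency falls; it does not give a formula. One simple mapping that satisfies this, shown purely as an assumption, is one minus the relative frequency of the group's class. Counting the frequency per macroblock (rather than per MB group) is also an assumption, as is the function name.

```python
from collections import Counter


def rarity(class_of_each_mb, group_class):
    """Assumed rarity Ka(j): 1 - relative frequency of the group's class.

    class_of_each_mb: class index (e.g. 0..49) for every macroblock in the screen.
    group_class: class index of the MB group being evaluated.
    """
    hist = Counter(class_of_each_mb)  # occurrence histogram over the screen
    return 1.0 - hist[group_class] / len(class_of_each_mb)
```

A class covering 10% of the screen then scores 0.9, while a background class covering 90% scores 0.1.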
[0029]
Further, the heterogeneity Kb(j) can be calculated by obtaining, at each point where the MB group adjoins a different MB group (identified by p), the difference Kb(j, p) in classification reference value between the MB groups, and then averaging Kb(j, p) over the MB group.
[0030]
The degree K(j) to which the object stands out in the picture can be calculated from the rarity Ka(j) and the heterogeneity Kb(j) obtained as described above by the following equations (4) and (5). Here, the function S is a sigmoid function introduced to smooth the output of the function passed as its argument.

[0031]
The gaze degree parameter V (j) for each object can be calculated by the following equation (6) using the three indices L (j), R (j), and K (j) calculated as described above.
V(j) = S(K(j)) / (S(L(j)) × S(R(j)))   ... (6)
[0032]
The gaze degree parameter V(j) calculated in this way takes a large value for an object whose motion is small, whose motion is little scattered, and which stands out in the picture, and thus matches human dynamic visual acuity and gaze characteristics.
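Equation (6) can be sketched directly. The text states only that S is a sigmoid; the standard logistic form and its gain and center parameters below are assumptions, and in practice they would need tuning to the ranges of K, L, and R.

```python
import math


def sigmoid(x, gain=1.0, center=0.0):
    """Logistic function used as S; gain and center are hypothetical parameters."""
    return 1.0 / (1.0 + math.exp(-gain * (x - center)))


def gaze_degree(k_j, l_j, r_j):
    """V(j) = S(K(j)) / (S(L(j)) * S(R(j))), following equation (6)."""
    return sigmoid(k_j) / (sigmoid(l_j) * sigmoid(r_j))
```

As the paragraph above says, increasing the conspicuousness K(j) raises V(j), while increasing the average motion L(j) or the scattering R(j) lowers it.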
[0033]
Returning to FIG. 2, in S3, a texture attribute parameter is determined for each macroblock. The texture attribute parameter for each macroblock can be determined in consideration of the variance value of the luminance block constituting the macroblock.
[0034]
FIG. 4 illustrates the principle of determining the texture attribute parameter. First, for the luminance variance lv(m) of each block of, for example, 8 × 8 pixels in a macroblock (blocks are identified here by m), a rank order filter is applied to the luminance variances of a total of five blocks, namely the block itself and the four blocks adjacent to it above, below, left, and right. A luminance variance other than the lowest, for example the second smallest value, is extracted and used as the corrected luminance variance clv(m). If the minimum were extracted as the corrected luminance variance of each block, the luminance variance of a single isolated flat block could end up being used; it is therefore preferable to use a value other than the lowest, for example the second smallest, as described above.
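The rank order filter can be sketched as follows. The handling of picture borders, where fewer than five blocks are available, is not specified in the text; the sketch simply filters over the blocks that exist. The function name and the list-of-lists layout are hypothetical.

```python
def corrected_block_variances(lv, rank=1):
    """Return clv(m) for a 2-D grid of block luminance variances lv(m).

    rank=1 selects the second smallest value in the 5-block cross window
    (the block itself plus its upper, lower, left, and right neighbours).
    """
    rows, cols = len(lv), len(lv[0])
    clv = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [lv[r][c]]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    window.append(lv[rr][cc])
            window.sort()
            # Clamp the rank when the window is truncated at a border.
            clv[r][c] = window[min(rank, len(window) - 1)]
    return clv
```

An isolated flat block surrounded by detailed blocks is thereby replaced by a detailed value, whereas two adjacent flat blocks keep their low variance.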
[0035]
The minimum value or the average value of the luminance variances extracted as described above for the blocks in the macroblock is set as the corrected luminance variance value A (k) of the macroblock.
[0036]
When the moving image is reproduced by interlaced scanning, it is preferable, as shown in FIG. 4, to apply the filter not only to the luminance variances a_{x,y} of the blocks in the frame but also to the luminance variances b_{x,y} of the blocks in the first field and the second field constituting the frame, extracting from each, for example, the second smallest value a′_{0,0}, ..., and to take the minimum or the average of all the luminance variances thus obtained as the corrected luminance variance A(k) of the macroblock.
[0037]
The texture attribute parameter t (k) can be determined by the following equation (7) by applying the corrected luminance variance value A (k) calculated as described above to the function S.
t(k) = S(A(k))   ... (7)
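A minimal sketch of the step from the corrected block variances to t(k) follows. The choice between minimum and average comes from the text; the logistic form of S and its gain and center values are assumptions chosen only so that typical variances fall in the sloped part of the curve.

```python
import math


def sigmoid(x, gain=0.01, center=400.0):
    """Logistic function used as S; gain and center are hypothetical values."""
    return 1.0 / (1.0 + math.exp(-gain * (x - center)))


def texture_attribute(clv_blocks, use_mean=False):
    """t(k) = S(A(k)), with A(k) the minimum or mean of the corrected variances.

    clv_blocks: corrected luminance variances clv(m) of the blocks in one macroblock.
    """
    if use_mean:
        a_k = sum(clv_blocks) / len(clv_blocks)
    else:
        a_k = min(clv_blocks)
    return sigmoid(a_k)
```

Flat macroblocks thus receive a small t(k) and detailed macroblocks a large one, with the sigmoid limiting the influence of extreme variances.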
[0038]
Macroblocks located at the boundary between MB groups are likely to contain a mixture of the attributes of the adjacent MB groups. Moreover, such macroblocks are likely to be exactly the parts on which humans tend to fixate, namely parts with large color changes or edges. In such areas, therefore, raising the visual priority can be expected to improve the subjective image quality. Accordingly, the texture attribute parameter of a macroblock BMB located at the boundary of an MB group is corrected as follows.
[0039]
First, let V_max be the maximum of the object gaze degree parameters V(j_0) of the BMB and the four macroblocks adjacent to it above, below, left, and right. Likewise, let t_min be the minimum of the texture attribute parameters t(k) of the BMB and those four adjacent macroblocks. Using these maximum and minimum values, the texture attribute parameter t(k) of the BMB is obtained by the following equation (8).
t(k) = (V(j_0) / V_max) × t_min   ... (8)
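The boundary correction can be sketched as below. The sketch reads V(j_0) in equation (8) as the gaze degree of the object that contains the BMB itself, which is an interpretation of the text rather than something it states outright; the function and argument names are hypothetical.

```python
def boundary_texture(v_own, v_neighbours, t_own, t_neighbours):
    """Corrected t(k) of a boundary macroblock, following equation (8).

    v_own: gaze degree V(j_0) of the object containing the BMB (interpretation).
    v_neighbours: gaze degrees of the objects of the 4 adjacent macroblocks.
    t_own, t_neighbours: texture attribute parameters of the same macroblocks.
    """
    v_max = max([v_own] + list(v_neighbours))
    t_min = min([t_own] + list(t_neighbours))
    return (v_own / v_max) * t_min
```

Under this reading, dividing the corrected t(k) by V(j_0) gives t_min / V_max, so the boundary macroblock inherits the most favourable texture and gaze values found in its neighbourhood.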
[0040]
Next, in S4 (FIG. 2), the visual priority parameter w(k) is calculated by the following equation (9) using V(j) and t(k) obtained by equations (6) and (7) or (8).
w(k) = t(k) / V(j)   ... (9)
The visual priority parameter w(k) is output to the adaptive quantizer 4 (FIG. 1) and used to determine the quantization parameter.
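Equation (9) combines the per-macroblock and per-object parameters. A minimal sketch, with a hypothetical data layout in which each macroblock index maps to the identifier of its object:

```python
def visual_priority(t, object_of_mb, v):
    """w(k) = t(k) / V(j) for every macroblock k, following equation (9).

    t: list of texture attribute parameters t(k), indexed by macroblock.
    object_of_mb: list giving the object identifier j of each macroblock.
    v: dict mapping object identifier j -> gaze degree parameter V(j).
    """
    return [t_k / v[object_of_mb[k]] for k, t_k in enumerate(t)]
```

If w(k) is used in the same way as the visual sensitivity in equation (2), a smaller w(k) leads to a smaller quantization parameter, so flat macroblocks in highly watched objects would be quantized most finely. How the adaptive quantizer 4 actually maps w(k) to a quantization parameter is not detailed in the text.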
[0041]
Although an embodiment of the present invention has been described above, the present invention can be changed and modified in various ways. For example, when extracting objects, that is, when grouping macroblocks, if the area of an MB group produced by the classification criterion is small, it can be merged with MB groups that adjoin it above, below, left, or right and have a close classification reference value, until the area of the MB group exceeds a predetermined area, to form the final MB group. This reduces the processing load without lowering the accuracy of the calculated visual priority.
[0042]
Further, when calculating the average L(j) of the motion amounts of the macroblocks included in the object, the influence of noise and the like can be eliminated by excluding from the calculation any motion amount that differs extremely from the others.
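The text does not define what counts as "extremely different". The sketch below assumes one common choice, rejecting values that lie further from the median than a multiple of the median absolute deviation; the function name, the ratio of 3.0, and the 0.5-pixel floor are all hypothetical.

```python
from statistics import median


def robust_mean_motion(amounts, max_ratio=3.0, floor=0.5):
    """Average motion amount L(j) with extreme values excluded.

    amounts: motion amounts |Vf(k)| of the macroblocks in one MB group.
    Values further than max_ratio * max(MAD, floor) from the median are dropped;
    this outlier rule is an assumption, not taken from the text.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)  # median absolute deviation
    tol = max_ratio * max(mad, floor)
    kept = [a for a in amounts if abs(a - med) <= tol]
    return sum(kept) / len(kept)
```

For example, a single spurious vector of length 40 among values of 2 and 3 is discarded, so L(j) stays near 2.5 instead of rising to 10.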
[0043]
[Effect of the invention]
As described in detail above, according to the present invention, the composition of objects and macroblocks in a screen can be analyzed with high accuracy prior to encoding, and analysis data that serves to optimize local bit allocation within the screen in the encoding unit can be obtained. This makes it possible to eliminate the image quality degradation in narrow-band HDTV (high definition TV) broadcasting that has been a problem in conventional moving image encoding.
[0044]
If the present invention is applied to video services for broadcasting stations, the codecs used for compressed transmission are expected to become even more efficient, high-quality HDTV transmission becomes feasible at low rates, and narrow-band video services such as SNG (satellite news gathering) and FPU (field pick-up) can be further enhanced.
[0045]
Further, if the present invention is applied to a broadband video providing service such as FTTH (fiber to the home), IP-based HDTV distribution becomes possible by utilizing a low-rate and high-quality HDTV encoding technology.
[0046]
The present invention can be applied to systems in general that handle moving image compression encoding, such as HDTV/SDTV (standard definition TV) using MPEG-2 or Motion JPEG 2000, in order to realize a high-quality video transmission system.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating an example of a moving image encoding apparatus including a preprocessing unit and a moving image encoding unit according to the present invention.
FIG. 2 is a flowchart of an embodiment of the processing in the preprocessing unit of FIG. 1.
FIG. 3 is an explanatory diagram of classification criteria for object extraction.
FIG. 4 is a diagram illustrating the principle of determining a texture attribute parameter.
FIG. 5 is a conceptual diagram of an MPEG-2 encoding verification model Test Model 5 [1].
[Explanation of symbols]
1 ... preprocessing unit, 2 ... moving image encoding unit, 3 ... MC + DCT encoder, 4 ... adaptive quantizer, 5 ... variable-length encoding unit

Claims

1. A pre-processing method for moving image encoding, comprising visual priority information calculating means that performs analysis on each input screen and calculates visual priority information for each macroblock, wherein the visual priority information calculated by the visual priority information calculating means is output to an encoding unit.

2. The pre-processing method for moving image encoding according to claim 1, wherein the visual priority information calculating means calculates the visual priority information in consideration of a gaze degree of each object and a texture attribute of each macroblock.

3. The pre-processing method for moving image encoding according to claim 2, wherein the objects are extracted based on grouping of macroblocks having the same attribute.

4. The pre-processing method for moving image encoding according to claim 3, wherein the classification reference value for grouping the macroblocks is defined using at least one index among a luminance component, a color difference component, and a motion component.

5. The pre-processing method for moving image encoding according to claim 2, wherein the gaze degree of each object is defined using at least one index among an average of the motion amounts of the macroblocks included in the object, a degree to which motion is scattered among the macroblocks, and a degree to which the object stands out in the picture.

6. The pre-processing method for moving image encoding according to any one of claims 2 to 5, wherein the texture attribute of each macroblock is determined in consideration of the variance values of the luminance blocks forming the macroblock.