JP2004007571A

JP2004007571A - Encoding apparatus and method, decoding apparatus and method, editing apparatus and method, recording medium, and program

Info

Publication number: JP2004007571A
Application number: JP2003107787A
Authority: JP
Inventors: Teruhiko Suzuki; 鈴木　輝彦
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2002-04-26
Filing date: 2003-04-11
Publication date: 2004-01-08
Anticipated expiration: 2023-04-11
Also published as: JP4875285B2

Abstract

[Problem] To perform encoding and decoding such that a buffer failure does not occur.
[Solution] Information such as a minimum bit rate, a minimum buffer size, and a minimum initial delay time is included in a random access point header located at an accessible point in a bit stream. A bit stream analysis unit 72 analyzes the input bit stream, sets the information described above, and outputs it to a buffer information adding unit 73. The buffer information adding unit 73 adds that information to the input bit stream and outputs the result. The present invention can be applied to encoding devices and decoding devices that handle bit streams.
[Selected drawing] Figure 5

Description

【０００１】
【発明の属する技術分野】
本発明は符号化装置および方法、復号装置および方法、編集装置および方法、記録媒体、並びにプログラムに関し、特に、離散コサイン変換若しくはカルーネン・レーベ変換等の直交変換と動き補償によって圧縮された画像情報（ビットストリーム）を、衛星放送、ケーブルテレビジョン放送、インターネットなどのネットワークメディアを介して送受信する際に、若しくは光ディスク、磁気ディスク、フラッシュメモリのような記憶メディア上で処理する際に用いて好適な符号化装置および方法、復号装置および方法、編集装置および方法、記録媒体、並びにプログラムに関する。
【０００２】
【従来の技術】
近年、画像情報をデジタルとして取り扱い、その際、効率の良い情報の伝送、蓄積を目的とし、画像情報特有の冗長性を利用して、離散コサイン変換等の直交変換と動き補償により圧縮するＭＰＥＧ（Ｍｏｖｉｎｇ　Ｐｉｃｔｕｒｅ　Ｅｘｐｅｒｔ　Ｇｒｏｕｐ）などの方式に準拠した装置が、放送局などの情報配信、および一般家庭における情報受信の双方において普及しつつある。
【０００３】
特に、ＭＰＥＧ２（ＩＳＯ／ＩＥＣ　１３８１８−２）は、汎用画像圧縮方式として定義された規格であり、飛び越し走査画像及び順次走査画像の双方、並びに標準解像度画像及び高精細画像を網羅する標準で、例えばＤＶＤ（Ｄｉｇｉｔａｌ　Ｖｅｒｓａｔｉｌｅ　Ｄｉｓｋ）規格に代表されるように、プロフェッショナル用途及びコンシューマ用途の広範なアプリケーションに広く用いられている。
【０００４】
このＭＰＥＧ２圧縮方式を用いることにより、例えば、７２０×４８０画素を持つ標準解像度の飛び越し走査画像に対しては４乃至８Ｍｂｐｓ、１９２０×１０８８画素を持つ高解像度の飛び越し走査画像に対しては１８乃至２２Ｍｂｐｓの符号量（ビットレート）を割り当てることで、高い圧縮率と良好な画質の実現が可能である。
【０００５】
ＭＰＥＧ２は主として放送用に適合する高画質符号化を対象としていたが、より高い圧縮率の符号化方式には対応していなかったので、ＭＰＥＧ４符号化方式の標準化が行われた。画像符号化方式に関しては、１９９８年１２月にＩＳＯ／ＩＥＣ　１４４９６−２としてその規格が国際標準に承認された。
【０００６】
さらに、近年、テレビ会議用の画像符号化を当初の目的として、国際電気連合の電気通信標準化部門であるＩＴＵ−Ｔ　（Ｉｎｔｅｒｎａｔｉｏｎａｌ　Ｔｅｌｅｃｏｍｍｕｎｉｃａｔｉｏｎ　Ｕｎｉｏｎ　−　Ｔｅｌｅｃｏｍｍｕｎｉｃａｔｉｏｎ　Ｓｔａｎｄａｒｄｉｚａｔｉｏｎ　Ｓｅｃｔｏｒ）によるＨ．２６Ｌ（ＩＴＵ−Ｔ　Ｑ６／１６ＶＣＥＧ）という標準の規格化が進んでいる。Ｈ．２６Ｌは、ＭＰＥＧ２やＭＰＥＧ４といった符号化方式に比べ、その符号化、復号に、より多くの演算量が要求されるものの、より高い符号化効率が実現されることが知られている。
【０００７】
また、現在、ＭＰＥＧ４の活動の一環として、このＨ．２６Ｌに基づいた、より高い符号化効率を実現する符号化技術の標準化がＩＴＵ−Ｔと共同でＪＶＴ（Ｊｏｉｎｔ　Ｖｉｄｅｏ　Ｔｅａｍ）として行われている。
【０００８】
ここで、離散コサイン変換若しくはカルーネン・レーベ変換等の直交変換と動き補償とによる画像圧縮について説明する。図１は、従来の画像情報符号化装置の一例の構成を示す図である。
【０００９】
図１に示した画像情報符号化装置１０において、入力端子１１より入力されたアナログ信号からなる画像情報は、Ａ／Ｄ変換部１２により、デジタル信号に変換される。そして、画面並べ替えバッファ１３は、Ａ／Ｄ変換部１２より供給された画像情報のＧＯＰ（Ｇｒｏｕｐ　ｏｆ　Ｐｉｃｔｕｒｅｓ）構造に応じて、フレームの並べ替えを行う。
【００１０】
ここで、画面並べ替えバッファ１３は、イントラ（画像内）符号化が行われる画像に対しては、フレーム全体の画像情報を直交変換部１５に供給する。直交変換部１５は、画像情報に対して離散コサイン変換若しくはカルーネン・レーベ変換等の直交変換を施し、変換係数を量子化部１６に供給する。量子化部１６は、直交変換部１５から供給された変換係数に対して量子化処理を施す。
【００１１】
可逆符号化部１７は、量子化部１６から供給された量子化された変換係数や量子化スケール等から符号化モードを決定し、この符号化モードに対して可変長符号化、又は算術符号化等の可逆符号化を施し、画像符号化単位のヘッダ部に挿入される情報を形成する。そして、可逆符号化部１７は、符号化された符号化モードを蓄積バッファ１８に供給して蓄積させる。この符号化された符号化モードは、画像圧縮情報として出力端子１９より出力される。
【００１２】
また、可逆符号化部１７は、量子化された変換係数に対して可変長符号化、若しくは算術符号化等の可逆符号化を施し、符号化された変換係数を蓄積バッファ１８に供給して蓄積させる。この符号化された変換係数は、画像圧縮情報として出力端子１９より出力される。
【００１３】
量子化部１６の挙動は、蓄積バッファ１８に蓄積された変換係数のデータ量に基づいて、レート制御部２０によって制御される。また、量子化部１６は、量子化後の変換係数を逆量子化部２１に供給し、逆量子化部２１は、その量子化後の変換係数を逆量子化する。逆直交変換部２２は、逆量子化された変換係数に対して逆直交変換処理を施して復号画像情報を生成し、その情報をフレームメモリ２３に供給して蓄積させる。
【００１４】
また、画面並べ替えバッファ１３は、インター（画像間）符号化が行われる画像に関しては、画像情報を動き予測・補償部２４に供給する。動き予測・補償部２４は、同時に参照される画像情報をフレームメモリ２３より取り出し、動き予測・補償処理を施して参照画像情報を生成する。動き予測・補償部２４は、生成した参照画像情報を加算器１４に供給し、加算器１４は、参照画像情報を対応する画像情報との差分信号に変換する。また、動き予測・補償部２４は、同時に動きベクトル情報を可逆符号化部１７に供給する。
【００１５】
可逆符号化部１７は、量子化部１６から供給され量子化された変換係数および量子化スケール、並びに動き予測・補償部２４から供給された動きベクトル情報等から符号化モードを決定し、その決定した符号化モードに対して可変長符号化または算術符号化等の可逆符号化を施し、画像符号化単位のヘッダ部に挿入される情報を生成する。そして、可逆符号化部１７は、符号化された符号化モードを蓄積バッファ１８に供給して蓄積させる。この符号化された符号化モードは、画像圧縮情報として出力される。
【００１６】
また、可逆符号化部１７は、その動きベクトル情報に対して可変長符号化若しくは算術符号化等の可逆符号化処理を施し、画像符号化単位のヘッダ部に挿入される情報を生成する。
【００１７】
また、イントラ符号化と異なり、インター符号化の場合、直交変換部１５に入力される画像情報は、加算器１４より得られた差分信号である。なお、その他の処理については、イントラ符号化を施される画像圧縮情報と同様であるため、その説明を省略する。
【００１８】
次に、上述した画像情報符号化装置１０に対応する画像情報復号装置の一例の構成を図２に示す。図２に示した画像情報復号装置４０において、入力端子４１より入力された画像圧縮情報は、蓄積バッファ４２において一時的に格納された後、可逆復号部４３に転送される。
【００１９】
可逆復号部４３は、定められた画像圧縮情報のフォーマットに基づき、画像圧縮情報に対して可変長復号若しくは算術復号等の処理を施し、ヘッダ部に格納された符号化モード情報を取得し逆量子化部４４等に供給する。また同様に、可逆復号部４３は、量子化された変換係数を取得し逆量子化部４４に供給する。さらに、可逆復号部４３は、復号するフレームがインター符号化されたものである場合には、画像圧縮情報のヘッダ部に格納された動きベクトル情報についても復号し、その情報を動き予測・補償部５１に供給する。
【００２０】
逆量子化部４４は、可逆復号部４３から供給された量子化後の変換係数を逆量子化し、変換係数を逆直交変換部４５に供給する。逆直交変換部４５は、定められた画像圧縮情報のフォーマットに基づき、変換係数に対して逆離散コサイン変換若しくは逆カルーネン・レーベ変換等の逆直交変換を施す。
【００２１】
ここで、対象となるフレームがイントラ符号化されたものである場合、逆直交変換処理が施された画像情報は、画面並べ替えバッファ４７に格納され、Ｄ／Ａ変換部４８におけるＤ／Ａ変換処理の後に出力端子４９から出力される。
【００２２】
また、対象となるフレームがインター符号化されたものである場合、動き予測・補償部５１は、可逆復号処理が施された動きベクトル情報とフレームメモリ５０に格納された画像情報とに基づいて参照画像を生成し、加算器４６に供給する。加算器４６は、この参照画像と逆直交変換部４５からの出力とを合成する。なお、その他の処理については、イントラ符号化されたフレームと同様であるため、説明を省略する。
【００２３】
ところで、先に述べたＪｏｉｎｔ　Ｖｉｄｅｏ　Ｔｅａｍで標準化が行われている符号化方式（以下ＪＶＴ　Ｃｏｄｅｃ）では、ＭＰＥＧ２やＭＰＥＧ４などの符号化効率を改善するため、様々な方式が検討されている。例えば、離散コサイン変換の変換方法は、４×４ブロックサイズの整数係数変換が用いられている。そして、動き補償の際のブロックサイズが可変であり、より最適な動き補償が行えるようになっている。しかしながら、基本的な方式は、図１に示した画像情報符号化装置１０において行われる符号化方式と同様に行うことが可能であるようにされている。
【００２４】
従って、図２に示した画像情報復号装置４０において行われる復号方式と、基本的に同じ方式により復号することが可能であるようにされている。
【００２５】
ところで異なる復号装置（デコーダ）間での互換性を維持し、バッファをオーバーフローまたはアンダーフローさせないために、ＭＰＥＧやＩＴＵ−Ｔでは、バッファモデルが導入されている。仮想デコーダバッファモデルを標準で定義し、符号化装置（エンコーダ）は、この仮想デコーダバッファを破綻しないように符号化することによりデコーダ側でのバッファオーバーフローまたはアンダーフローを防ぎ、互換性を維持することが可能とされている。
【００２６】
ＭＰＥＧにおける仮想バッファモデルについて、図３を参照して説明する。以下の説明において、デコーダバッファへの入力ビットレートをＲ、デコーダバッファのサイズをＢ、デコーダが最初のフレームをバッファから引き抜く時のバッファ占有量をＦ、その際の遅延時間をＤとする。また、時刻ｔ０，ｔ１，ｔ２，・・・における各フレームのビット量をｂ０，ｂ１，ｂ２・・・とする。
【００２７】
ここでフレームレートをＭとすると、
ｔ_ｉ＋１−ｔ_ｉ＝１／Ｍが成り立つ。
【００２８】
Ｂ_ｉを、時刻ｔ_ｉにおけるフレームのビット量ｂ_ｉを引き抜く直前のバッファ占有量とすると以下の式（１）が成り立つ。
Ｂ_０＝Ｆ
Ｂ_ｉ＋１＝ｍｉｎ（Ｂ，Ｂ_ｉ―ｂ_ｉ＋Ｒ（ｔ_ｉ＋１−ｔ_ｉ））　・・・（１）
【００２９】
ここで、ＭＰＥＧ２における固定ビットレート符号化方式の場合、エンコーダは次式（２）の条件を満たすよう符号化しなければならない。
Ｂｉ≦Ｂ
Ｂｉ−ｂｉ≧０　　・・・（２）
このような条件が満たされている間は、エンコーダは、バッファオーバーフローやアンダーフローを発生させてしまうような符号化を行うようなことがないとされている。
【００３０】
また、ＭＰＥＧ２における可変ビットレート符号化方式の場合、入力ビットレートＲは、プロファイル、レベルで定義される最大ビットレートであり、Ｆ＝Ｂである。従って式（１）は、次式（３）のように書き換えられる。
Ｂ_０＝Ｂ
Ｂ_ｉ＋１＝ｍｉｎ（Ｂ，Ｂ_ｉ―ｂ_ｉ＋Ｒ_ｍａｘ（ｔ_ｉ＋１−ｔ_ｉ））　・・・（３）
【００３１】
この時、エンコーダは、次式（４）に表される条件を満たすように符号化を実行しなければならない。
Ｂ_ｉ―ｂ_ｉ≧０　　・・・（４）
この条件が満たされるとき、エンコーダは、デコーダ側でバッファアンダーフローが起こらないような符号化を行うことになる。デコーダバッファが一杯になった時は、エンコーダバッファは空であり、符号化ビットストリームが発生していないことを意味する。従って、エンコーダは、デコーダのバッファオーバーフローを起こさないように監視する必要は無い。
【００３２】
ＭＰＥＧでは、各プロファイル、レベルで定義されるバッファサイズ、ビットレートに基づいて上述したようなバッファの制約を守るように符号化が行なわれる。各プロファイル、レベルに準拠したデコーダは、そのビットストリームを破綻することなく復号することができる。
【００３３】
【発明が解決しようとする課題】
しかしながら、実際にはプロファイル、レベルに規定されたバッファサイズ、ビットレートを用いない場合でも、ビットストリームを復号することが出来る場合がある。
【００３４】
例えば、ビットレートＲ、バッファＢ、初期遅延時間Ｆ（Ｒ，Ｂ，Ｆ）で符号化されたビットストリームは、より大きなバッファサイズＢ’（Ｂ’＞Ｂ）を持つデコーダによっても復号可能である。また、より高いビットレートＲ’（Ｒ’＞Ｒ）で復号することも可能である。
【００３５】
例えば、デコーダの復号ビットレートが、符号化ビットレートより低い場合においても、十分大きなバッファサイズをもったデコーダであれば復号することが可能である。
【００３６】
このように、所定のビットストリームが与えられた場合、各ビットレートにおいて、そのビットストリームを復号するために必要な最小バッファサイズＢ_ｍｉｎが存在する。このような関係を図４に示す。
【００３７】
ＪＶＴ　Ｃｏｄｅｃでは、各プロファイル、レベルで固定のビットレート、バッファサイズで復号するだけでなく、図４に示したような条件を有するデコーダで復号できるように標準化が進められている。必ずしもエンコーダの符号化ビットレート、バッファサイズとデコーダの復号ビットレート、バッファサイズが同一でなくとも復号できることを目的としている。この目的が達成されることにより、例えば、復号ビットレートが高いデコーダでは、バッファサイズを削減することなどが可能になる。
【００３８】
しかしながら、このような情報は、ビットストリーム中で時間的に変動する。そのため、デコーダ互換のための制約が緩められている分、所定の条件下では復号可能であっても、別の条件下では復号不可能になる場合があるといった問題があった。例えば、このような（Ｒ，Ｂ）の特性が時間的に変動する場合、所定の時刻で復号可能であっても、別の時刻では復号不可能である可能性があるといった問題があった。
【００３９】
ランダムアクセスなどで、別なシーンや、別なチャンネルなどに移行した場合も、必ずしも復号可能であるとは限らなくなるといった問題があった。また、スプライシング（Ｓｐｌｉｃｉｎｇ）などビットストリームレベルでの編集を行った際、デコード可能性を保証できなくなるといった問題があった。
【００４０】
本発明はこのような状況に鑑みてなされたものであり、ビットストリームの復号可能性を効率よく判断し、またスプライシングなどビットストリームの編集を簡便に行えるようにすることを目的とする。
【００４１】
【課題を解決するための手段】
本発明の符号化装置は、復号時に必要に応じ参照されるヘッダを生成する生成手段と、生成手段により生成されたヘッダと、入力された画像信号をそれぞれ符号化する符号化手段と、符号化手段により符号化されたヘッダと画像信号を多重化し、ビットストリームを出力する出力手段とを含み、生成手段は、ビットストリームを復号する際のバッファに関するバッファ特性情報を含むヘッダを生成することを特徴とする。
【００４２】
前記生成手段は、ビットストリーム中でランダムにアクセスが可能な所定区間毎に、バッファ特性情報を含む前記ヘッダを生成するようにすることができる。
【００４３】
前記生成手段は、ビットストリームのシーケンス全体のバッファ特性の情報を含むヘッダを生成するようにすることができる。
【００４４】
前記バッファ特性情報は、ビットストリームを復号する際の復号可能な最小ビットレート、最小バッファサイズ、および、最小遅延量の全てを含むようにすることができる。
【００４５】
本発明の符号化方法は、復号時に必要に応じ参照されるヘッダを生成する生成ステップと、生成ステップの処理で生成されたヘッダと、入力された画像信号をそれぞれ符号化する符号化ステップと、符号化ステップの処理で符号化されたヘッダと画像信号を多重化したビットストリームの出力を制御する出力制御ステップとを含み、生成ステップの処理は、ビットストリームを復号する際のバッファに関するバッファ特性情報を少なくとも含むヘッダを生成することを特徴とする。
【００４６】
本発明の第１の記録媒体のプログラムは、復号時に必要に応じ参照されるヘッダを生成する生成ステップと、生成ステップの処理で生成されたヘッダと、入力された画像信号をそれぞれ符号化する符号化ステップと、符号化ステップの処理で符号化されたヘッダと画像信号を多重化したビットストリームの出力を制御する出力制御ステップとを含み、生成ステップの処理は、ビットストリームを復号する際のバッファに関するバッファ特性情報を少なくとも含むヘッダを生成することを特徴とする。
【００４７】
本発明の第１のプログラムは、復号時に必要に応じ参照されるヘッダを生成する生成ステップと、生成ステップの処理で生成されたヘッダと、入力された画像信号をそれぞれ符号化する符号化ステップと、符号化ステップの処理で符号化されたヘッダと画像信号を多重化したビットストリームの出力を制御する出力制御ステップとを含む処理をコンピュータに実行させ、生成ステップの処理は、ビットストリームを復号する際のバッファに関するバッファ特性情報を少なくとも含むヘッダを生成することを特徴とする。
【００４８】
本発明の復号装置は、入力されたビットストリーム内のヘッダを検索する検索手段と、検索手段により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出したバッファ特性情報に基づいてビットストリームを復号する復号手段とを含むことを特徴とする。
【００４９】
前記バッファ特性情報は、ビットストリームを復号する際の復号可能な最小ビットレート、最小バッファサイズ、および、最小遅延量の全てを含むようにすることができる。
【００５０】
本発明の復号方法は、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出したバッファ特性情報に基づいてビットストリームを復号する復号ステップとを含むことを特徴とする。
【００５１】
本発明の第２の記録媒体のプログラムは、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出したバッファ特性情報に基づいてビットストリームを復号する復号ステップとを含むことを特徴とする。
【００５２】
本発明の第２のプログラムは、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出したバッファ特性情報に基づいてビットストリームを復号する復号ステップとを含む処理をコンピュータに実行させることを特徴とする。
【００５３】
本発明の編集装置は、入力されたビットストリーム内のヘッダを検索する検索手段と、検索手段により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出した情報に基づいてビットストリームの編集が可能であるか否かを判断する判断手段と、判断手段によりビットストリームの編集が可能であると判断された場合、ビットストリームの編集を行う編集手段とを含み、判断手段は、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一である場合、第１のビットストリームと第２のビットストリームを用いた編集は可能であると判断することを特徴とする。
【００５４】
本発明の編集方法は、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出した情報に基づいてビットストリームの編集が可能であるか否かを判断する判断ステップと、判断ステップの処理でビットストリームの編集が可能であると判断された場合、ビットストリームの編集を行う編集ステップとを含み、判断ステップの処理は、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一である場合、第１のビットストリームと第２のビットストリームを用いた編集は可能であると判断することを特徴とする。
【００５５】
本発明の第３の記録媒体のプログラムは、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出した情報に基づいてビットストリームの編集が可能であるか否かを判断する判断ステップと、判断ステップの処理でビットストリームの編集が可能であると判断された場合、ビットストリームの編集を行う編集ステップとを含み、判断ステップの処理は、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一である場合、第１のビットストリームと第２のビットストリームを用いた編集は可能であると判断することを特徴とする。
【００５６】
本発明の第３のプログラムは、入力されたビットストリーム内のヘッダを検索する検索ステップと、検索ステップの処理により検索されたヘッダに含まれるバッファに関するバッファ特性情報を読み出し、その読み出した情報に基づいてビットストリームの編集が可能であるか否かを判断する判断ステップと、判断ステップの処理でビットストリームの編集が可能であると判断された場合、ビットストリームの編集を行う編集ステップとを含む処理をコンピュータに実行させ、判断ステップの処理は、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一である場合、第１のビットストリームと第２のビットストリームを用いた編集は可能であると判断することを特徴とする。
【００５７】
本発明の符号化装置および方法、並びに第１のプログラムにおいては、ビットストリームに符号化されて多重化されるヘッダに、そのビットストリームを復号する際のバッファに関するバッファ特性の情報が含まれる。
【００５８】
本発明の復号装置および方法、並びに第２のプログラムにおいては、入力されたビットストリームのヘッダに含まれる、復号時のバッファに関するバッファ特性の情報が読み出され、その読み出された情報に基づき復号が行われる。
【００５９】
本発明の編集装置および方法、並びに第３のプログラムにおいては、入力されたビットストリームに対して編集が行えるか否かの判断が、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一であるかを判断することにより行われる。
【００６０】
【発明の実施の形態】
以下に、本発明の実施の形態について図面を参照して説明する。図５は、本発明を適用した符号化装置の一実施の形態の構成を示す図である。図５に示した符号化装置７０は、図１に示した画像情報符号化装置１０を含む構成とされている。ここでは、画像情報符号化装置１０の構成などについては、既に説明したので、その説明は適宜省略する。
【００６１】
画像情報符号化装置１０に入力された画像情報は、符号化され、画像圧縮情報（ＢＳ：ビットストリーム）としてバッファ７１とビットストリーム解析部７２に出力される。バッファ７１は、入力されたビットストリームを一旦記憶し、必要に応じ、バッファ情報付加部７３に出力する。ビットストリーム解析部７２は、ビットストリーム中の所定の区間、例えば、ＧＯＰやランダムアクセスポイント間でのバッファの占有状態を調べ、その情報をバッファ情報ＢＨとしてバッファ情報付加部７３に供給する。ここで、ランダムアクセスポイントとは、ＪＶＴ規格において、ビットストリーム中でランダムにアクセスが可能な所定の区間のことを言う。また、同様にＧＯＰとは、ＭＰＥＧ２／ＭＰＥＧ４規格において、ランダムにアクセスが可能な所定の区間のことを言う。
【００６２】
バッファ情報付加部７３は、入力されたバッファ情報ＢＨを、同じく入力されたビットストリームに付加して出力する。
【００６３】
ここでは、ビットストリーム解析部７２が行う解析の一例として、各ランダムアクセスポイント間でバッファ占有状態を調べ、各ランダムアクセスポイントにヘッダ情報としてバッファ占有状態の情報を符号化してビットストリームを構成する場合を例にあげて説明する。ここでは、このような説明を行うが、ＧＯＰ単位で求めるようにしても良いし、他の任意の単位で求めるようにしても良く、以下に説明する単位に、他の単位を用いた場合においても、本発明を適用できることは言うまでもない。
【００６４】
図６を参照して（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ）の特性を決定する方法について説明する。ここで、Ｒ_ｍｉｎは、バッファへの入力ビットレートＲの最小値を示し、Ｂ_ｍｉｎは、バッファサイズＢの最小値を示すとする。
【００６５】
所定のビットストリームのビットレートＲが与えられた場合、そのビットストリームを復号ビットレートＲで復号する復号装置（例えば、図７に示す構成を有する）で復号可能である最低限のバッファサイズＢ_ｍｉｎは、例えば、以下のようにして決定される。
【００６６】
所定のアクセスポイント間のフレーム数をＮとする。各フレームの発生ビット量をｂ（ｉ）（ｉ＝１，Ｎ）、バッファから各フレームのデータを引き抜く直前のバッファ占有量をＢ（ｉ）、引き抜いた直後のバッファ占有量をＢ２（ｉ）とする。符号化装置のバッファ量をＢとすれば、
Ｂ２（ｉ）＝Ｂ（ｉ）―ｂ（ｉ）
Ｂ（ｉ＋１）＝Ｂ２（ｉ）＋Ｒ／（Ｆｒａｍｅ　Ｒａｔｅ）　・・・（５）
ただし、ｉｆ（Ｂ（ｉ＋１）＞Ｂ）Ｂ（ｉ＋１）＝Ｂとし、Ｂ（ｉ）の最大値はＢである。また遅延量ＦはＦ＝Ｂとする。
【００６７】
このとき、Ｂ_ｍｉｎは、次式（６）で求められる。
Ｂ_ｍｉｎ＝Ｂ―ｍｉｎ（Ｂ２（ｉ））　　・・・（６）
このときのＲをＲ_ｍｉｎとすれば、上記のような方法により（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ）を決定することができる。
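The procedure of equations (5) and (6) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `min_buffer_size` and its parameter names are hypothetical, the buffer is assumed to start full (F = B), and every frame interval is assumed to be 1 / (frame rate).

```python
def min_buffer_size(frame_bits, R, B, frame_rate):
    """Return B_min = B - min(B2(i)) for the frames between two access points.

    frame_bits : bits b(i) generated for each frame (hypothetical input)
    R          : input bit rate to the buffer, in bits per second
    B          : buffer size assumed by the encoder, in bits
    frame_rate : frames per second
    """
    if not frame_bits:
        raise ValueError("at least one frame is required")
    occupancy = B              # B(1): the buffer starts full because F = B
    lowest_after = None        # running minimum of B2(i)
    for b in frame_bits:
        after = occupancy - b  # B2(i) = B(i) - b(i)
        if lowest_after is None or after < lowest_after:
            lowest_after = after
        # B(i+1) = B2(i) + R / frame_rate, clipped to B as in equation (5)
        occupancy = min(B, after + R / frame_rate)
    return B - lowest_after    # equation (6)
```

Pairing the returned value with the rate R used for the run gives one (R_min, B_min) point. Repeating the run for several rates would give the set of points that the text later interpolates into a characteristic curve.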
【００６８】
次に、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）を決定する方法の一例を説明する。Ｂ＝Ｂ_ｍｉｎ、Ｒ＝Ｒ_ｍｉｎとする。式（５）と同様に、次式（７）が成り立つ。
Ｂ２（ｉ）＝Ｂ（ｉ）−ｂ（ｉ）
Ｂ（ｉ＋１）＝Ｂ２（ｉ）＋Ｒ／（Ｆｒａｍｅ　Ｒａｔｅ）　・・・（７）
となる。ただし、以下の条件に基づくアンダーフローに対する監視が行われる。

【００６９】
Ｆ_ｍｉｎは、各ランダムアクセスポイントの先頭で０に初期化される。また、オーバーフローに対する監視も同様に、以下の条件に基づき行われる。
ｉｆ（Ｂ（ｉ＋１）＞Ｂ）Ｂ（ｉ＋１）＝Ｂ
ランダムアクセスポイント間の全てのフレームに対して上記した検査が行われることにより、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）が決定される。
【００７０】
上記した（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）は、予め定められた所定の個数だけ検査を行うようにしても良いし、その中で独立な組み合わせのみを定義するようにしても良い。上記のようにして求められた特性は、図４に示すようになる。各点の間は線形補間される。上記のようにして求められた、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）の値、バッファ情報ＢＨは、バッファ情報付加部７３によりビットストリーム中の所定の位置に挿入され、符号化され出力される。
【００７１】
ビットストリーム解析部７２は、上述したような、各ランダムアクセス間の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）と同時にビットストリーム全体に対して同様の解析を行い、ビットストリーム全体に対する特性、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌを決定し、この値を、バッファ情報付加部７３に、バッファ情報ＢＨとして供給する。
【００７２】
画像情報符号化装置１０から出力されたビットストリームＢＳは、バッファ７１において所定の時間だけ遅延された後、バッファ情報付加部７３に入力される。バッファ情報付加部７３は、ビットストリーム中の所定の位置にビットストリーム解析部７２より供給されるバッファ情報ＢＨを挿入し、最終的な出力ビットストリームＢＳを出力する。
【００７３】
ここで、バッファ情報ＢＨ（若しくはバッファ特性情報）は、例えば、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）や（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌである。バッファ情報付加部７３は、ビットストリームＢＳ中の所定の位置に、上記情報を挿入する。ここでシンタクスの一例を以下に示し説明する。
【００７４】

【００７５】
ランダムアクセスポイント間の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）は、例えば、その直前のランダムアクセスポイントヘッダに、上記したシンタクスのように記録される。ＲＡＰ＿ｓｔａｒｔｃｏｄｅは、ＲＡＰヘッダが存在し、そのヘッダの開始を示すコードである。
【００７６】
ｃｌｏｓｅｄ＿ＧＯＰは、そのＧＯＰ内の全てのピクチャが他のＧＯＰのピクチャを参照することがなく独立であるか、または、他のＧＯＰのピクチャを参照するという依存関係があるかどうかを示すフラグである。ｂｒｏｋｅｎ＿ｌｉｎｋは、編集などにより、そのＧＯＰの前後でビットストリームの置き換えが行われた場合、予測の参照画像が存在するか否かを示すフラグである。
【００７７】
ＮｕｍＢｕｆｆｅｒ＿Ｐａｒａｍは、求めた特性セット（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）の数を示す。Ｒａｔｅ［ｉ］、Ｂｕｆｆｅｒ［ｉ］、Ｆ［ｉ］は、それぞれＲ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎを示す。ここでは、例えば、Ｒ_ｍｉｎは、小さいものから順に記録される。
【００７８】
ビットストリーム全体の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌは、例えば、そのビットストリームの先頭のシーケンスヘッダに、以下のシンタクスのようにして記録される。
【００７９】

【００８０】
ここで、ＮｕｍＢｕｆｆｅｒ＿Ｐａｒａｍは、求めた特性セット（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌの数を示す。Ｒａｔｅ［ｉ］、Ｂｕｆｆｅｒ［ｉ］、Ｆ［ｉ］は、それぞれをＲ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎを示す。ここでは、例えば、Ｒ_ｍｉｎは、小さいものから順に記録される。
【００８１】
バッファ情報付加部７３において、上記のバッファ情報ＢＨが付加された後、最終的な出力ビットストリームＢＳが出力される。
【００８２】
なお、発明の実施の形態ではバッファ情報ＢＨとして、最小ビットレートＲｍｉｎ、最小バッファサイズＢｍｉｎおよび最小遅延量Ｆｍｉｎの全てをビットストリームに付加するように説明した。しかし、この例に限らず、最小ビットレートＲｍｉｎ、最小バッファサイズＢｍｉｎ若しくは最小遅延量Ｆｍｉｎのうち、少なくとも一つをビットストリームに加えるようにしてもよい。例えば、最小ビットレートＲｍｉｎおよび最小バッファサイズＢｍｉｎの組み合わせをビットストリームに付加するようにしてもよい。
【００８３】
図７に本発明を適用した復号装置の一実施の形態の構成を示す。図７に示した復号装置９０は、図５に示した符号化装置７０に対応するものであり、内部に、図２に示した画像情報復号装置４０を含んでいる。復号装置９０に入力されたビットストリームＢＳは、ビットストリーム解析部９１と復号可能性判定部９２に供給される。
【００８４】
ビットストリーム解析部９１は、ビットストリーム中のバッファ情報ＢＨを復号し、復号可能性判定部９２に出力する。ビットストリーム解析部９１は、ビットストリームをパースし、シーケンスヘッダに記録されている、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌを復号する。また、各ランダムアクセスポイントヘッダに記録されている、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）を復号する。これら情報が復号可能性判定部９２に出力される。
【００８５】
復号可能性判定部９２は、バッファ情報ＢＨおよび画像情報復号装置４０より供給されるデコーダ情報ＤＩに基づいて、入力されたビットストリームがバッファを破綻させること無く復号可能であるかどうかを判定する。デコーダ情報ＤＩは、例えば、デコーダバッファサイズおよび復号ビットレートなどである。
【００８６】
復号可能性判定部９２は、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌから、図４に示したような特性曲線を作成する。各点の間は線形補間する。この時、デコーダ（復号装置９０）のバッファおよび復号ビットレートが（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌにより作られる特性曲線より上に位置する場合、入力されたビットストリームは、復号可能であると判断することが可能である。従ってこのようなとき、復号可能性判定部９２は、復号可能であると判定し、ビットストリームを画像情報復号装置４０に供給する。
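The decodability judgment described above (build a characteristic curve from the recorded points by linear interpolation, then test whether the decoder lies above it) can be sketched as follows. The names `required_buffer` and `is_decodable` are hypothetical. The sketch makes several assumptions that the text does not state: only the (rate, buffer) pair is compared and F_min is ignored, rates are strictly ascending, a decoder rate below the smallest recorded rate is treated as not decodable, the curve is held flat above the largest recorded rate, and a point lying exactly on the curve counts as decodable.

```python
from bisect import bisect_right

def required_buffer(points, rate):
    """Interpolate the minimum buffer size needed at a given decoding rate.

    points : list of (R_min, B_min) pairs sorted by strictly ascending rate
    Returns None when the rate is below the smallest recorded rate.
    """
    rates = [r for r, _ in points]
    if rate < rates[0]:
        return None                      # assumption: outside the known curve
    if rate >= rates[-1]:
        return points[-1][1]             # assumption: flat beyond the last point
    k = bisect_right(rates, rate)        # rates[k-1] <= rate < rates[k]
    (r0, b0), (r1, b1) = points[k - 1], points[k]
    return b0 + (b1 - b0) * (rate - r0) / (r1 - r0)

def is_decodable(points, decoder_rate, decoder_buffer):
    """True when the decoder sits on or above the characteristic curve."""
    need = required_buffer(points, decoder_rate)
    return need is not None and decoder_buffer >= need
```

For example, with the illustrative points `[(1000, 800), (2000, 400), (4000, 200)]`, a decoder running at 1500 bits per second needs a buffer of at least 600 bits under these assumptions.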
【００８７】
画像情報復号装置４０は、図２に示した画像情報復号装置４０と基本的に同様な構成により、同様な処理を実行し、入力されたビットストリームを復号し、画像情報を図示されていないテレビジョン受像機などに出力する。
【００８８】
ビットストリーム全体を復号可能であるかどうかは上記のように、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）ｇｌｏｂａｌの特性曲線、デコーダバッファサイズ、復号ビットレートを調べることによって判定することが可能である。
【００８９】
また、ランダムアクセスなどにより、所定のランダムアクセスポイントから特定の区間のみを復号したい場合、同様にして、復号可能性判定部９２は、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）から図４に示すような特性曲線を作成する。各点の間は線形補間する。この時、デコーダのバッファおよび復号ビットレートが（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）により作られる特性曲線より上に位置する場合、ビットストリームは復号可能である。従ってこのようなとき、復号可能性判定部９２は、復号可能であると判定し、ビットストリームを画像情報復号装置４０に供給する。
【００９０】
次にビットストリームの編集を行う際の説明を行う。図８は、本発明を適用したビットストリームの編集を行う編集装置１１０の一実施の形態の構成を示す図である。編集装置１１０が行う編集の例として、入力ビットストリーム１の一部を、別の入力ビットストリーム２に置き換えるスプライスを行う場合を例に挙げて説明する。
【００９１】
ここで、スプライスについて簡単に説明するに、スプライスとは、所定のビットストリームをランダムアクセスポイントにおいて別のビットストリームに置き換えて編集を行うことである。このようなスプライスは、例えば、テレビジョン放送の番組に、コマーシャルの放送を挿入する際などである。この場合、入力ビットストリーム１がテレビジョン放送の番組のビットストリームであり、入力ビットストリーム２がコマーシャルのビットストリームである。
【００９２】
入力ビットストリーム１は、ビットストリーム解析部１１１−１に入力され、入力ビットストリーム２は、ビットストリーム解析部１１１−２に入力される。ビットストリーム解析部１１１−１，１１１−２は、それぞれ入力されたビットストリーム１，２中に含まれているバッファ情報ＢＨ１，２を復号し、ビットストリーム編集部１１２に出力する。
【００９３】
ビットストリーム編集部１１２は、バッファ情報ＢＨ１，２に基づき、所定の編集ポイントで、入力ビットストリーム１に対して入力ビットストリーム２を挿入可能であるか否かを判定する。この時、編集後のビットストリームが、デコーダ（復号装置９０）のバッファを破綻させずに復号可能であるためには、ランダムアクセスポイントとその直前のバッファ占有量の値が同一であるという条件が必要である。
【００９４】
ＭＰＥＧ２，４方式を用いるデコーダは、特定のビットレート、バッファサイズで動作することが想定されていたが、ＪＶＴ方式を用いるデコーダにおいては、図４に示すように、その他のビットレート、バッファサイズであっても、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）の特性曲線より上にある場合、復号することが可能であるようにバッファに対する制約が緩和されている。
【００９５】
ビットストリームの編集により、その編集前後でデコード可能性が変化しないようにするためには、編集区間の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）が同一であれば良い。従って、ビットストリーム編集部１１２は、編集区間に位置するランダムアクセスポイントヘッダにおける（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）特性を、入力ビットストリーム１，２に対して作成し、これらの値が一致する場合、その区間をビットストリーム２に置換する。一致しない場合、ビットストリーム１または２に対してパディングビットを挿入して、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）が一致するようにした後、入力ビットストリーム２に置換する。
【００９６】
ＪＶＴにおいては、バッファに対する規制が緩和されているが、このことを利用すれば、スプライスにおけるバッファの適合条件を緩和することが可能になる。ＪＶＴにおいては、デコーダのバッファサイズおよび復号ビットレートが（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）の上に位置する場合、復号可能であることがわかる。従って、元の入力ビットストリーム１の所定の編集区間の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）に対して、挿入する入力ビットストリーム２の所定編集区間の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）が常に下にある場合、入力ビットストリーム１を復号可能なデコーダは、その区間をビットストリーム２に置換しても復号可能であることになる。
【００９７】
図９にその関係を図示する。曲線１は、入力ビットストリーム１の編集区間での（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）特性を示す。曲線２は入力ビットストリーム２の編集区間での（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）特性を示す。デコーダのバッファ、復号ビットレートが、この曲線の上に来る場合、復号可能であることから、図９に示すように曲線２が常に曲線１の下に来るとき、復号可能であることが保証される。
【００９８】
従って、ビットストリーム編集部１１２は、編集区間に位置するランダムアクセスポイントヘッダにおける（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）特性を、ビットストリーム１，２に対して作成し、ビットストリーム２の特性曲線が、ビットストリーム１の特性曲線の下に来る場合、その区間をビットストリーム２に置換する。
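The splice condition described above (the characteristic curve of bit stream 2 must lie on or below that of bit stream 1 over the editing section) can be sketched as follows. The names `curve_value` and `can_splice` are hypothetical. The sketch assumes that each curve is a list of (R_min, B_min) points with strictly ascending rates, that the curve is held flat above its largest recorded rate, that it is undefined below its smallest recorded rate, and that F_min is ignored. Because both curves are piecewise linear, comparing them at every recorded rate of either curve is enough.

```python
def curve_value(points, rate):
    """Linearly interpolated B_min at `rate`, or None below the first point."""
    if rate < points[0][0]:
        return None
    for (r0, b0), (r1, b1) in zip(points, points[1:]):
        if r0 <= rate <= r1:
            return b0 + (b1 - b0) * (rate - r0) / (r1 - r0)
    return points[-1][1]      # assumption: flat beyond the last recorded rate

def can_splice(curve1, curve2):
    """True when curve2 never rises above curve1 where curve1 is defined."""
    rates = sorted({r for r, _ in curve1} | {r for r, _ in curve2})
    for rate in rates:
        need1 = curve_value(curve1, rate)
        if need1 is None:
            continue          # no decoder of stream 1 is known to work here
        need2 = curve_value(curve2, rate)
        if need2 is None or need2 > need1:
            return False      # some decoder of stream 1 could fail on stream 2
    return True
```

When `can_splice` returns False, the text's remedy is to insert padding bits so that the curve of bit stream 2 moves below that of bit stream 1; that step is not modeled here.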
【００９９】
逆に、一致しないような場合、ビットストリーム１または２に対してパディングビットを挿入して、ビットストリーム２の（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）特性曲線が、ビットストリーム１の特性曲線の下に位置するように変更した後、入力ビットストリーム２に置換する。
【０１００】
このような条件を満たすようにスプライスを行った場合、ビットストリーム１を復号可能なデコーダを破綻させることはない。ビットストリーム編集部１１２はスプライスをした後、最終的なビットストリームを出力する。
【０１０１】
このように、ビットストリーム中のランダムアクセスが行えるポイントのヘッダに、（Ｒ_ｍｉｎ、Ｂ_ｍｉｎ、Ｆ_ｍｉｎ）といった最小ビットレート、最小バッファサイズ、最小初期遅延時間などの情報を含ませることにより、復号側において、ビットストリームの復号可能性を効率良く判断することが可能となり、また、スプライシングなどのビットストリームの編集を容易に、かつ、復号側のバッファを破綻させることなく復号が常に行えるようにすることが可能となる。
【０１０２】
図１０は、汎用のパーソナルコンピュータの内部構成例を示す図である。パーソナルコンピュータのＣＰＵ（Ｃｅｎｔｒａｌ　Ｐｒｏｃｅｓｓｉｎｇ　Ｕｎｉｔ）２１１は、ＲＯＭ（Ｒｅａｄ　Ｏｎｌｙ　Ｍｅｍｏｒｙ）２１２に記憶されているプログラムに従って各種の処理を実行する。ＲＡＭ（Ｒａｎｄｏｍ　Ａｃｃｅｓｓ　Ｍｅｍｏｒｙ）２１３には、ＣＰＵ２１１が各種の処理を実行する上において必要なデータやプログラムなどが適宜記憶される。入出力インタフェース２１５は、キーボードやマウスから構成される入力部２１６が接続され、入力部２１６に入力された信号をＣＰＵ２１１に出力する。また、入出力インタフェース２１５には、ディスプレイやスピーカなどから構成される出力部２１７も接続されている。
【０１０３】
さらに、入出力インタフェース２１５には、ハードディスクなどから構成される記憶部２１８、および、インターネットなどのネットワークを介して他の装置とデータの授受を行う通信部２１９も接続されている。ドライブ２２０は、磁気ディスク２３１、光ディスク２３２、光磁気ディスク２３３、半導体メモリ２３４などの記録媒体からデータを読み出したり、データを書き込んだりするときに用いられる。
【０１０４】
記録媒体は、図１０に示すように、パーソナルコンピュータとは別に、ユーザにプログラムを提供するために配布される、プログラムが記録されている磁気ディスク２３１（フレキシブルディスクを含む）、光ディスク２３２（ＣＤ−ＲＯＭ（Ｃｏｍｐａｃｔ　Ｄｉｓｃ−Ｒｅａｄ　Ｏｎｌｙ　Ｍｅｍｏｒｙ），ＤＶＤ（Ｄｉｇｉｔａｌ　Ｖｅｒｓａｔｉｌｅ　Ｄｉｓｃ）を含む）、光磁気ディスク２３３（ＭＤ（Ｍｉｎｉ−Ｄｉｓｃ）（登録商標）を含む）、若しくは半導体メモリ２３４などよりなるパッケージメディアにより構成されるだけでなく、コンピュータに予め組み込まれた状態でユーザに提供される、プログラムが記憶されているＲＯＭ２１２や記憶部２１８が含まれるハードディスクなどで構成される。
【０１０５】
なお、本明細書において、媒体により提供されるプログラムを記述するステップは、記載された順序に従って、時系列的に行われる処理は勿論、必ずしも時系列的に処理されなくとも、並列的あるいは個別に実行される処理をも含むものである。
【０１０６】
また、本明細書において、システムとは、複数の装置により構成される装置全体を表すものである。
【０１０７】
【発明の効果】
以上の如く本発明の符号化装置および方法、並びに第１のプログラムによれば、ビットストリームに符号化されて多重化されるヘッダに、そのビットストリームを復号する際のバッファに関するバッファ特性の情報を含ませるようにしたので、復号側で、バッファが破綻してしまうようなことを防ぐことが可能となる。
【０１０８】
また、本発明の復号装置および方法、並びに第２のプログラムによれば、入力されたビットストリームのヘッダに含まれる、復号時のバッファに関するバッファ特性の情報が読み出され、その読み出された情報に基づき復号が行われるようにしたので、復号時にバッファが破綻してしまうようなことを防ぐことが可能となる。
【０１０９】
さらに、本発明の編集装置および方法、並びに第３のプログラムによれば、入力されたビットストリームに対して編集が行えるか否かの判断を、第１のビットストリームのヘッダに含まれる情報により作成される特性曲線が、第２のビットストリームのヘッダに含まれる情報により作成される特性曲線の常に上に位置するか、または、同一であるかを判断することにより行うようにしたので、スプライスなどの編集にかかる処理を軽減させ、容易に編集可能であるか否かを判断することが可能となる。
【図面の簡単な説明】
【図１】従来の画像情報符号化装置の一例の構成を示す図である。
【図２】従来の画像情報復号装置の一例の構成を示す図である。
【図３】バッファ量について説明する図である。
【図４】ビットレートとバッファ量の関係について説明する図である。
【図５】本発明を適用した符号化装置の一実施の形態の構成を示す図である。
【図６】バッファ量について説明する図である。
【図７】本発明を適用した復号装置の一実施の形態の構成を示す図である。
【図８】本発明を適用した編集装置の一実施の形態の構成を示す図である。
【図９】ビットレートとバッファ量の関係について説明する図である。
【図１０】媒体を説明する図である。
【符号の説明】
７０　符号化装置，　７１　バッファ，　７２　ビットストリーム解析部，　７３　バッファ情報付加部，　９０　復号装置，　９１　ビットストリーム解析部，　９２　復号可能性判定部，　１１１　ビットストリーム解析部，　１１２　ビットストリーム編集部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an encoding device and method, a decoding device and method, an editing device and method, a recording medium, and a program. In particular, it relates to an encoding device and method, a decoding device and method, an editing device and method, a recording medium, and a program suitable for use when image information (a bit stream) compressed by motion compensation and an orthogonal transform, such as the discrete cosine transform or the Karhunen-Loeve transform, is transmitted and received via network media such as satellite broadcasting, cable television broadcasting, or the Internet, or is processed on storage media such as optical disks, magnetic disks, or flash memory.
[0002]
[Prior art]
In recent years, image information has come to be handled digitally. For the purpose of efficient transmission and storage of that information, devices compliant with schemes such as MPEG (Moving Picture Expert Group), which exploit the redundancy peculiar to image information and compress it by motion compensation and an orthogonal transform such as the discrete cosine transform, are becoming widespread both for information distribution by broadcasting stations and for information reception in ordinary homes.
[0003]
In particular, MPEG2 (ISO/IEC 13818-2) is defined as a general-purpose image compression standard that covers both interlaced and progressive scanning images, as well as standard-resolution and high-definition images. As typified by the DVD (Digital Versatile Disk) standard, it is widely used in a broad range of professional and consumer applications.
[0004]
With this MPEG2 compression method, a high compression rate and good image quality can be achieved by allocating a code amount (bit rate) of, for example, 4 to 8 Mbps to a standard-resolution interlaced image of 720 × 480 pixels, and 18 to 22 Mbps to a high-resolution interlaced image of 1920 × 1088 pixels.
[0005]
MPEG2 was mainly intended for high-quality encoding suitable for broadcasting, but because it did not support encoding methods with higher compression rates, the MPEG4 encoding method was standardized. Regarding the image encoding system, the standard was approved as an international standard as ISO / IEC 14496-2 in December 1998.
[0006]
Furthermore, in recent years, a standard called H.26L (ITU-T Q6/16 VCEG) has been under development by ITU-T (International Telecommunication Union - Telecommunication Standardization Sector), the telecommunication standardization sector of the International Telecommunication Union, originally for the purpose of image coding for video conferencing. H.26L is known to achieve higher coding efficiency than coding methods such as MPEG2 and MPEG4, although it requires a larger amount of computation for encoding and decoding.
[0007]
In addition, as part of the MPEG4 activities, standardization of a coding technology that is based on H.26L and achieves higher coding efficiency is currently being carried out jointly with ITU-T as the JVT (Joint Video Team).
[0008]
Here, image compression by orthogonal transform such as discrete cosine transform or Karhunen-Loeve transform and motion compensation will be described. FIG. 1 is a diagram illustrating a configuration of an example of a conventional image information encoding device.
[0009]
In the image information encoding apparatus 10 shown in FIG. 1, image information composed of an analog signal input from the input terminal 11 is converted into a digital signal by the A / D conversion unit 12. The screen rearrangement buffer 13 rearranges the frames according to the GOP (Group of Pictures) structure of the image information supplied from the A / D conversion unit 12.
[0010]
Here, the screen rearrangement buffer 13 supplies image information of the entire frame to the orthogonal transform unit 15 for an image on which intra (intra-image) encoding is performed. The orthogonal transform unit 15 performs orthogonal transform such as discrete cosine transform or Karhunen-Loeve transform on the image information, and supplies transform coefficients to the quantization unit 16. The quantization unit 16 performs a quantization process on the transform coefficient supplied from the orthogonal transform unit 15.
[0011]
The lossless encoding unit 17 determines an encoding mode from the quantized transform coefficient, quantization scale, and the like supplied from the quantization unit 16, and performs variable length encoding or arithmetic encoding on the encoding mode. The information to be inserted into the header portion of the image coding unit is formed. Then, the lossless encoding unit 17 supplies the encoded encoding mode to the accumulation buffer 18 for accumulation. The encoded encoding mode is output from the output terminal 19 as image compression information.
[0012]
The lossless encoding unit 17 also performs lossless encoding, such as variable length encoding or arithmetic encoding, on the quantized transform coefficients, and supplies the encoded transform coefficients to the accumulation buffer 18 for accumulation. The encoded transform coefficients are output from the output terminal 19 as image compression information.
[0013]
The behavior of the quantization unit 16 is controlled by the rate control unit 20 based on the amount of transform coefficient data accumulated in the accumulation buffer 18. The quantization unit 16 also supplies the quantized transform coefficients to the inverse quantization unit 21, and the inverse quantization unit 21 inversely quantizes them. The inverse orthogonal transform unit 22 performs inverse orthogonal transform processing on the inversely quantized transform coefficients to generate decoded image information, and supplies that information to the frame memory 23 for accumulation.
[0014]
In addition, the screen rearrangement buffer 13 supplies image information to the motion prediction / compensation unit 24 regarding an image on which inter (inter-image) encoding is performed. The motion prediction / compensation unit 24 extracts image information that is referred to at the same time from the frame memory 23, and performs motion prediction / compensation processing to generate reference image information. The motion prediction / compensation unit 24 supplies the generated reference image information to the adder 14, and the adder 14 converts the reference image information into a difference signal from the corresponding image information. In addition, the motion prediction / compensation unit 24 supplies motion vector information to the lossless encoding unit 17 at the same time.
[0015]
The lossless encoding unit 17 determines the encoding mode from the quantized transform coefficient and quantization scale supplied from the quantization unit 16, the motion vector information supplied from the motion prediction / compensation unit 24, and the like. The coding mode is subjected to lossless coding such as variable length coding or arithmetic coding, and information to be inserted into the header portion of the image coding unit is generated. Then, the lossless encoding unit 17 supplies the encoded encoding mode to the accumulation buffer 18 for accumulation. The encoded encoding mode is output as image compression information.
[0016]
Further, the lossless encoding unit 17 performs lossless encoding processing such as variable length encoding or arithmetic encoding on the motion vector information, and generates information to be inserted into the header portion of the image encoding unit.
[0017]
In contrast to intra coding, in the case of inter coding, the image information input to the orthogonal transform unit 15 is a difference signal obtained from the adder 14. The other processing is the same as the image compression information subjected to intra coding, and therefore the description thereof is omitted.
[0018]
Next, FIG. 2 shows a configuration of an example of an image information decoding device corresponding to the image information encoding device 10 described above. In the image information decoding apparatus 40 shown in FIG. 2, the image compression information input from the input terminal 41 is temporarily stored in the accumulation buffer 42 and then transferred to the lossless decoding unit 43.
[0019]
The lossless decoding unit 43 performs processing such as variable length decoding or arithmetic decoding on the image compression information based on the defined format of the image compression information, acquires the encoding mode information stored in the header portion, and supplies it to the inverse quantization unit 44 and other units. Similarly, the lossless decoding unit 43 acquires the quantized transform coefficients and supplies them to the inverse quantization unit 44. Furthermore, when the frame to be decoded is inter-coded, the lossless decoding unit 43 also decodes the motion vector information stored in the header portion of the image compression information and supplies that information to the motion prediction / compensation unit 51.
[0020]
The inverse quantization unit 44 inversely quantizes the quantized transform coefficients supplied from the lossless decoding unit 43 and supplies the transform coefficients to the inverse orthogonal transform unit 45. The inverse orthogonal transform unit 45 performs an inverse orthogonal transform, such as the inverse discrete cosine transform or the inverse Karhunen-Loeve transform, on the transform coefficients based on the defined format of the image compression information.
[0021]
Here, when the target frame is intra-coded, the image information subjected to the inverse orthogonal transform process is stored in the screen rearrangement buffer 47 and is output from the output terminal 49 after D / A conversion in the D / A conversion unit 48.
[0022]
When the target frame is inter-coded, the motion prediction / compensation unit 51 generates a reference image based on the motion vector information subjected to the lossless decoding process and the image information stored in the frame memory 50, and supplies it to the adder 46. The adder 46 combines this reference image with the output from the inverse orthogonal transform unit 45. The other processing is the same as that for an intra-coded frame, and thus description thereof is omitted.
[0023]
By the way, in the encoding system (hereinafter referred to as JVT Codec) standardized by the above-mentioned Joint Video Team, various systems are being studied in order to improve the encoding efficiency of MPEG2 and MPEG4. For example, the conversion method of the discrete cosine transform uses integer coefficient conversion of 4 × 4 block size. The block size at the time of motion compensation is variable, and more optimal motion compensation can be performed. However, the basic method can be performed in the same manner as the encoding method performed in the image information encoding device 10 shown in FIG.
[0024]
Therefore, it is possible to perform decoding by basically the same method as the decoding method performed in the image information decoding device 40 shown in FIG.
[0025]
By the way, in order to maintain compatibility between different decoding apparatuses (decoders) and prevent the buffer from overflowing or underflowing, a buffer model is introduced in MPEG and ITU-T. A virtual decoder buffer model is defined in the standard, and the encoding device (encoder) performs encoding so that the virtual decoder buffer does not fail, which makes it possible to prevent buffer overflow or underflow on the decoder side and to maintain compatibility.
[0026]
A virtual buffer model in MPEG will be described with reference to FIG. In the following description, the input bit rate to the decoder buffer is R, the size of the decoder buffer is B, the buffer occupancy when the decoder extracts the first frame from the buffer is F, and the delay time at that time is D. Further, the bit amounts of the frames at times t0, t1, t2, ... are b0, b1, b2, ..., respectively.
[0027]
If the frame rate is M,
$$t_{i+1} - t_i = 1 / M$$
holds.
[0028]
Let $B_i$ be the buffer occupancy immediately before the bit amount $b_i$ of the frame at time $t_i$ is extracted. Then the following equation (1) holds.
$$B_0 = F$$
$$B_{i+1} = \min\bigl(B,\; B_i - b_i + R\,(t_{i+1} - t_i)\bigr) \quad \ldots (1)$$
[0029]
Here, in the case of the constant bit rate encoding method in MPEG2, the encoder must encode so as to satisfy the condition of the following equation (2).
$$B_i \le B$$
$$B_i - b_i \ge 0 \quad \ldots (2)$$
While such a condition is satisfied, the encoder does not perform encoding that causes a buffer overflow or underflow.
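As a minimal sketch of the buffer model of equations (1) and (2), the recurrence can be simulated frame by frame. The function name `simulate_cbr_buffer` and its parameter names are illustrative assumptions and do not appear in the specification; the sketch also assumes one frame is extracted every 1/M seconds.

```python
def simulate_cbr_buffer(frame_bits, rate, buffer_size, initial_fullness, frame_rate):
    """Return the occupancy B_i just before each frame is extracted (equation (1)).

    Raises ValueError as soon as a frame violates condition (2).
    All names here are hypothetical; only the recurrence comes from the text.
    """
    occupancy = initial_fullness        # B_0 = F
    fill_per_frame = rate / frame_rate  # R * (t_{i+1} - t_i), with t_{i+1} - t_i = 1 / M
    history = []
    for i, bits in enumerate(frame_bits):
        # Condition (2), first part: B_i <= B. Because equation (1) clamps at B,
        # this can only fail for the initial occupancy F.
        if occupancy > buffer_size:
            raise ValueError(f"overflow before frame {i}")
        # Condition (2), second part: B_i - b_i >= 0.
        if occupancy - bits < 0:
            raise ValueError(f"underflow at frame {i}")
        history.append(occupancy)
        # Equation (1): B_{i+1} = min(B, B_i - b_i + R (t_{i+1} - t_i))
        occupancy = min(buffer_size, occupancy - bits + fill_per_frame)
    return history
```

For example, with frames of 400, 100, and 300 bits, R = 3000, B = 1000, F = 500, and M = 10, the occupancies before extraction are 500, 400, and 600, and both parts of condition (2) hold throughout.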
[0030]
Further, in the case of the variable bit rate encoding method in MPEG2, the input bit rate R is the maximum bit rate defined by the profile and level, and F = B. Therefore, the equation (1) can be rewritten as the following equation (3).
$$B_0 = B$$
$$B_{i+1} = \min\bigl(B,\; B_i - b_i + R_{\max}\,(t_{i+1} - t_i)\bigr) \quad \ldots (3)$$
[0031]
At this time, the encoder must execute encoding so as to satisfy the condition expressed by the following equation (4).
$$B_i - b_i \ge 0 \quad \ldots (4)$$
When this condition is satisfied, the encoder performs encoding so that no buffer underflow occurs on the decoder side. When the decoder buffer is full, it means that the encoder buffer is empty and no encoded bit stream has been generated. Thus, the encoder need not monitor the decoder for buffer overflow.
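The variable bit rate case of equations (3) and (4) can be sketched in the same way: the buffer starts full (F = B), fills at the maximum rate, and only underflow needs to be checked. The function name `vbr_underflow_frames` is hypothetical, and continuing the simulation from an empty buffer after an underflow is an assumption made here only so that all offending frames can be listed.

```python
def vbr_underflow_frames(frame_bits, max_rate, buffer_size, frame_rate):
    """Return the indices of frames that violate condition (4) under equation (3)."""
    occupancy = buffer_size              # B_0 = B, because F = B
    fill_per_frame = max_rate / frame_rate
    violations = []
    for i, bits in enumerate(frame_bits):
        remaining = occupancy - bits     # B_i - b_i
        if remaining < 0:                # condition (4) is violated
            violations.append(i)
            remaining = 0                # assumption: resume from an empty buffer
        # Equation (3): clamp at B, so overflow never has to be tested.
        occupancy = min(buffer_size, remaining + fill_per_frame)
    return violations
```

An empty result means that the stream satisfies condition (4) for the given R_max and B.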
[0032]
In MPEG, encoding is performed so as to observe the above-described buffer restrictions based on the buffer size and bit rate defined by each profile and level. A decoder conforming to each profile and level can therefore decode the bitstream without causing the buffer to fail.
[0033]
[Problems to be solved by the invention]
However, there are cases where the bit stream can be decoded even when the buffer size and the bit rate specified in the profile and level are not actually used.
[0034]
For example, a bitstream encoded with a bit rate R, a buffer size B, and an initial delay time F, denoted (R, B, F), can also be decoded by a decoder having a larger buffer size B′ (B′ > B). It can also be decoded at a higher bit rate R′ (R′ > R).
[0035]
For example, even when the decoding bit rate of the decoder is lower than the encoding bit rate, a decoder having a sufficiently large buffer size can decode the bitstream.
[0036]
Thus, for a given bitstream, there exists at each bit rate a minimum buffer size B_min required to decode that bitstream. Such a relationship is shown in FIG.
[0037]
JVT Codec is being standardized so that it can be decoded not only by a fixed bit rate and buffer size for each profile and level, but also by a decoder having the conditions shown in FIG. The purpose is to be able to perform decoding even if the encoding bit rate and buffer size of the encoder are not the same as the decoding bit rate and buffer size of the decoder. By achieving this object, for example, a decoder having a high decoding bit rate can reduce the buffer size.
[0038]
However, such information varies in time in the bitstream. For this reason, there is a problem that even if decoding is possible under a predetermined condition, decoding may not be possible under another condition because restrictions for decoder compatibility are relaxed. For example, when such (R, B) characteristics fluctuate with time, there is a problem that even if decoding is possible at a predetermined time, decoding may not be possible at another time.
[0039]
There has been a problem that even when moving to another scene or another channel due to random access or the like, decoding is not always possible. In addition, when editing at the bit stream level, such as splicing, there is a problem that it is impossible to guarantee the decoding possibility.
[0040]
The present invention has been made in view of such a situation, and an object of the present invention is to efficiently determine the decodability of a bitstream and to easily edit a bitstream such as splicing.
[0041]
[Means for Solving the Problems]
An encoding apparatus according to the present invention includes: generating means for generating a header to be referred to as necessary at the time of decoding; encoding means for encoding each of the header generated by the generating means and an input image signal; and output means for multiplexing the header and the image signal encoded by the encoding means and outputting a bitstream, wherein the generating means generates a header including buffer characteristic information relating to a buffer used when decoding the bitstream.
[0042]
The generating means may generate the header including the buffer characteristic information for each predetermined section that can be accessed randomly in the bitstream.
[0043]
The generating means may generate a header including information on buffer characteristics of the entire sequence of the bitstream.
[0044]
The buffer characteristic information may include all of the minimum decodable bit rate, the minimum buffer size, and the minimum delay amount when decoding the bitstream.
[0045]
The encoding method of the present invention includes: a generation step of generating a header to be referred to as necessary at the time of decoding; an encoding step of encoding each of the header generated by the processing of the generation step and an input image signal; and an output control step of controlling the output of a bitstream obtained by multiplexing the header and the image signal encoded by the processing of the encoding step, wherein the processing of the generation step generates a header including at least buffer characteristic information relating to a buffer used when decoding the bitstream.
[0046]
The program of the first recording medium of the present invention includes: a generation step of generating a header to be referred to as necessary at the time of decoding; an encoding step of encoding each of the header generated by the processing of the generation step and an input image signal; and an output control step of controlling the output of a bitstream obtained by multiplexing the header and the image signal encoded by the processing of the encoding step, wherein the processing of the generation step generates a header including at least buffer characteristic information relating to a buffer used when decoding the bitstream.
[0047]
The first program of the present invention causes a computer to execute processing including: a generation step of generating a header to be referred to as necessary at the time of decoding; an encoding step of encoding each of the header generated by the processing of the generation step and an input image signal; and an output control step of controlling the output of a bitstream obtained by multiplexing the header and the image signal encoded by the processing of the encoding step, wherein the processing of the generation step generates a header including at least buffer characteristic information relating to a buffer used when decoding the bitstream.
[0048]
The decoding device according to the present invention includes: search means for searching for a header in an input bitstream; and decoding means for reading buffer characteristic information relating to a buffer included in the header found by the search means and decoding the bitstream based on the read buffer characteristic information.
[0049]
The buffer characteristic information may include all of the minimum decodable bit rate, the minimum buffer size, and the minimum delay amount when decoding the bitstream.
[0050]
The decoding method of the present invention includes: a search step of searching for a header in an input bitstream; and a decoding step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and decoding the bitstream based on the read buffer characteristic information.
[0051]
The program of the second recording medium of the present invention includes: a search step of searching for a header in an input bitstream; and a decoding step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and decoding the bitstream based on the read buffer characteristic information.
[0052]
The second program of the present invention causes a computer to execute processing including: a search step of searching for a header in an input bitstream; and a decoding step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and decoding the bitstream based on the read buffer characteristic information.
[0053]
The editing apparatus of the present invention includes: search means for searching for a header in an input bitstream; determination means for reading buffer characteristic information relating to a buffer included in the header found by the search means and determining, based on the read information, whether or not the bitstream can be edited; and editing means for editing the bitstream when the determination means determines that the bitstream can be edited, wherein the determination means determines that editing using a first bitstream and a second bitstream is possible if the characteristic curve created from the information included in the header of the first bitstream is always above or identical to the characteristic curve created from the information included in the header of the second bitstream.
[0054]
The editing method of the present invention includes: a search step of searching for a header in an input bitstream; a determination step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and determining, based on the read information, whether or not the bitstream can be edited; and an editing step of editing the bitstream when it is determined in the processing of the determination step that the bitstream can be edited, wherein the processing of the determination step determines that editing using a first bitstream and a second bitstream is possible if the characteristic curve created from the information included in the header of the first bitstream is always above or identical to the characteristic curve created from the information included in the header of the second bitstream.
[0055]
The program of the third recording medium of the present invention includes: a search step of searching for a header in an input bitstream; a determination step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and determining, based on the read information, whether or not the bitstream can be edited; and an editing step of editing the bitstream when it is determined in the processing of the determination step that the bitstream can be edited, wherein the processing of the determination step determines that editing using a first bitstream and a second bitstream is possible if the characteristic curve created from the information included in the header of the first bitstream is always above or identical to the characteristic curve created from the information included in the header of the second bitstream.
[0056]
The third program of the present invention causes a computer to execute processing including: a search step of searching for a header in an input bitstream; a determination step of reading buffer characteristic information relating to a buffer included in the header found by the processing of the search step and determining, based on the read information, whether or not the bitstream can be edited; and an editing step of editing the bitstream when it is determined in the processing of the determination step that the bitstream can be edited, wherein the processing of the determination step determines that editing using a first bitstream and a second bitstream is possible if the characteristic curve created from the information included in the header of the first bitstream is always above or identical to the characteristic curve created from the information included in the header of the second bitstream.
[0057]
In the encoding apparatus and method and the first program of the present invention, the header that is encoded and multiplexed into the bitstream includes buffer characteristic information relating to the buffer used when the bitstream is decoded.
[0058]
In the decoding apparatus and method and the second program of the present invention, the buffer characteristic information relating to the buffer at the time of decoding, which is included in the header of the input bitstream, is read, and decoding is performed based on the read information.
[0059]
In the editing apparatus and method and the third program of the present invention, whether or not an input bitstream can be edited is determined by determining whether the characteristic curve created from the information included in the header of the first bitstream is always above or identical to the characteristic curve created from the information included in the header of the second bitstream.
[0060]
[Detailed Description of the Invention]
Embodiments of the present invention will be described below with reference to the drawings. FIG. 5 is a diagram showing a configuration of an embodiment of an encoding apparatus to which the present invention is applied. The encoding device 70 shown in FIG. 5 is configured to include the image information encoding device 10 shown in FIG. Here, since the configuration of the image information encoding apparatus 10 has already been described, the description thereof will be omitted as appropriate.
[0061]
The image information input to the image information encoding device 10 is encoded and output to the buffer 71 and the bitstream analysis unit 72 as image compression information (BS: bitstream). The buffer 71 temporarily stores the input bit stream and outputs it to the buffer information adding unit 73 as necessary. The bit stream analysis unit 72 checks the buffer occupancy state between predetermined intervals in the bit stream, for example, GOPs or random access points, and supplies the information to the buffer information addition unit 73 as buffer information BH. Here, the random access point refers to a predetermined section in the JVT standard that can be accessed randomly in the bitstream. Similarly, the GOP refers to a predetermined section that can be accessed randomly in the MPEG2 / MPEG4 standard.
[0062]
The buffer information adding unit 73 adds the input buffer information BH to the input bit stream and outputs the same.
[0063]
Here, as an example of the analysis performed by the bitstream analysis unit 72, a case will be described in which the buffer occupancy state is checked between random access points and the buffer occupancy state information is encoded as header information at each random access point to form the bitstream. Although this case is described here, the information may instead be calculated in GOP units or in any other unit, and it goes without saying that the present invention is applicable when such other units are used in place of the unit described below.
[0064]
With reference to FIG., the method of determining (R_min, B_min) will be described. Here, R_min indicates the minimum value of the input bit rate R to the buffer, and B_min indicates the minimum buffer size B.
[0065]
When the bit rate R of a predetermined bitstream is given, the minimum buffer size B_min with which a decoding device (for example, one having the configuration shown in FIG. 7) that decodes the bitstream at the decoding bit rate R can decode it is determined, for example, as follows.
[0066]
Let N be the number of frames between predetermined access points. Let b(i) (i = 1, ..., N) be the generated bit amount of each frame, B(i) the buffer occupancy immediately before the data of each frame is extracted from the buffer, and B2(i) the buffer occupancy immediately after extraction. If the buffer amount of the encoding device is B,
$$B2(i) = B(i) - b(i)$$
$$B(i+1) = B2(i) + R / (\text{Frame Rate}) \quad \ldots (5)$$
where B(i+1) is set to B if B(i+1) > B, so that the maximum value of B(i) is B. The delay amount F is F = B.
[0067]
At this time, B_min is obtained by the following equation (6).
$$B_{\min} = B - \min_i\bigl(B2(i)\bigr) \quad \ldots (6)$$
Taking R at this time as R_min, (R_min, B_min) can be determined by the above method.
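The procedure of equations (5) and (6) can be sketched as follows. The function name `min_buffer_size` and its parameters are illustrative assumptions; the sketch further assumes that the simulation buffer B is chosen large enough that B2(i) never becomes negative, which the text does not state explicitly.

```python
def min_buffer_size(frame_bits, rate, frame_rate, buffer_size):
    """Return B_min for one section at bit rate `rate`, per equations (5) and (6)."""
    if not frame_bits:
        raise ValueError("at least one frame is required")
    occupancy = buffer_size                 # B(1) = B, because F = B
    fill_per_frame = rate / frame_rate      # R / (Frame Rate)
    lowest_after = None
    for bits in frame_bits:
        after = occupancy - bits            # B2(i) = B(i) - b(i)
        if lowest_after is None or after < lowest_after:
            lowest_after = after
        # Equation (5), with B(i+1) clamped to B on overflow.
        occupancy = min(buffer_size, after + fill_per_frame)
    return buffer_size - lowest_after       # equation (6): B_min = B - min(B2(i))
```

For frames of 800, 400, and 200 bits at R = 3000 with a frame rate of 10 and B = 1000, B2(i) takes the values 200, 100, and 200, so B_min = 1000 - 100 = 900. Repeating the call for several values of `rate` would yield the set of (R_min, B_min) points.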
[0068]
Next, the method of determining (R_min, B_min, F_min) will be described. Let B = B_min and R = R_min. As with equation (5), the following equation (7) holds.
$$B2(i) = B(i) - b(i)$$
$$B(i+1) = B2(i) + R / (\text{Frame Rate}) \quad \ldots (7)$$
However, underflow is monitored based on the following condition.

[0069]
F_min is initialized to 0 at the beginning of each random access point. Similarly, overflow is monitored based on the following condition.
if (B(i+1) > B) B(i+1) = B
By performing the above check on all frames between random access points, (R_min, B_min, F_min) is determined.
[0070]
A predetermined number of (R_min, B_min, F_min) sets may be examined, or only independent combinations may be defined. The characteristics obtained as described above are as shown in FIG. Linear interpolation is performed between the points. The (R_min, B_min, F_min) obtained as described above and the buffer information BH are inserted into a predetermined position in the bitstream by the buffer information adding unit 73, encoded, and output.
[0071]
The bitstream analysis unit 72 determines (R_min, B_min, F_min) between random access points as described above and, at the same time, performs the same analysis on the entire bitstream to determine the characteristics for the entire bitstream, (R_min, B_min, F_min)global. This value is supplied to the buffer information adding unit 73 as buffer information BH.
[0072]
The bit stream BS output from the image information encoding device 10 is delayed by a predetermined time in the buffer 71 and then input to the buffer information adding unit 73. The buffer information adding unit 73 inserts the buffer information BH supplied from the bit stream analyzing unit 72 at a predetermined position in the bit stream, and outputs the final output bit stream BS.
[0073]
Here, the buffer information BH (or buffer characteristic information) is, for example, (R_min, B_min, F_min) and (R_min, B_min, F_min)global. The buffer information adding unit 73 inserts the information at a predetermined position in the bitstream BS. Here, an example of syntax is shown and described below.
[0074]

[0075]
The (R_min, B_min, F_min) between random access points are recorded in the immediately preceding random access point header, for example, as in the syntax described above. RAP_startcode is a code indicating the presence of a RAP header and the start of the header.
[0076]
The closed_GOP is a flag indicating whether all the pictures in the GOP are independent without referring to a picture of another GOP, or whether there is a dependency relationship referring to a picture of another GOP. The “broken_link” is a flag indicating whether or not a prediction reference image exists when a bitstream is replaced before and after the GOP by editing or the like.
[0077]
NumBuffer_Param indicates the number of characteristic sets (R_min, B_min, F_min) obtained. Rate[i], Buffer[i], and F[i] indicate R_min, B_min, and F_min, respectively. Here, for example, the sets are recorded in ascending order of R_min.
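The relationship between the characteristic sets and the header fields named above can be illustrated with a small in-memory sketch. The class `BufferParam` and the function `build_buffer_fields` are hypothetical; only the field names NumBuffer_Param, Rate[i], Buffer[i], and F[i] and the ascending order of R_min come from the text, and no bit widths or binary layout are implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferParam:
    """One characteristic set (R_min, B_min, F_min); a hypothetical container."""
    rate: float    # R_min
    buffer: float  # B_min
    delay: float   # F_min


def build_buffer_fields(params):
    """Arrange the sets as the header fields, in ascending order of R_min."""
    ordered = sorted(params, key=lambda p: p.rate)
    return {
        "NumBuffer_Param": len(ordered),
        "Rate": [p.rate for p in ordered],
        "Buffer": [p.buffer for p in ordered],
        "F": [p.delay for p in ordered],
    }
```

A call with two sets given in arbitrary order returns NumBuffer_Param = 2 with the three arrays aligned by index i.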
[0078]
The (R_min, B_min, F_min)global for the entire bitstream is recorded in the sequence header at the head of the bitstream, for example, with the following syntax.
[0079]

[0080]
Here, NumBuffer_Param indicates the number of characteristic sets (R_min, B_min, F_min)global obtained. Rate[i], Buffer[i], and F[i] indicate R_min, B_min, and F_min, respectively. Here, for example, the sets are recorded in ascending order of R_min.
[0081]
In the buffer information adding unit 73, after the buffer information BH is added, the final output bit stream BS is output.
[0082]
In the embodiment of the invention, the buffer information BH is described so that all of the minimum bit rate Rmin, the minimum buffer size Bmin, and the minimum delay amount Fmin are added to the bitstream. However, the present invention is not limited to this example, and at least one of the minimum bit rate Rmin, the minimum buffer size Bmin, and the minimum delay amount Fmin may be added to the bitstream. For example, a combination of the minimum bit rate Rmin and the minimum buffer size Bmin may be added to the bitstream.
[0083]
FIG. 7 shows the configuration of an embodiment of a decoding apparatus to which the present invention is applied. The decoding device 90 shown in FIG. 7 corresponds to the encoding device 70 shown in FIG. 5, and includes the image information decoding device 40 shown in FIG. 2 inside. The bit stream BS input to the decoding device 90 is supplied to the bit stream analysis unit 91 and the decoding possibility determination unit 92.
[0084]
The bitstream analysis unit 91 decodes the buffer information BH in the bitstream and outputs it to the decoding possibility determination unit 92. Specifically, the bitstream analysis unit 91 parses the bitstream and decodes the (R_min, B_min, F_min)global recorded in the sequence header. It also decodes the (R_min, B_min, F_min) recorded in each random access point header. These pieces of information are output to the decoding possibility determination unit 92.
[0085]
Based on the buffer information BH and the decoder information DI supplied from the image information decoding device 40, the decodability determination unit 92 determines whether the input bitstream can be decoded without causing the buffer to fail. The decoder information DI is, for example, a decoder buffer size and a decoding bit rate.
[0086]
The decoding possibility determination unit 92 creates a characteristic curve as shown in FIG. 4 from (R_min, B_min, F_min)global. Linear interpolation is performed between the points. At this time, if the buffer size and decoding bit rate of the decoder (decoding device 90) are located above the characteristic curve created from (R_min, B_min, F_min)global, it can be determined that the input bitstream is decodable. Therefore, in such a case, the decoding possibility determination unit 92 determines that decoding is possible and supplies the bitstream to the image information decoding device 40.
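The determination described above can be sketched as a point-versus-curve test. The function names `required_buffer` and `is_decodable` are hypothetical. Several points are assumptions not fixed by the text: the curve is given as (R_min, B_min) pairs with strictly increasing rates, a decoder lying exactly on the curve is accepted, a rate below the lowest recorded point is rejected, and beyond the highest recorded rate the last buffer size is held.

```python
def required_buffer(curve, rate):
    """Buffer size on the linearly interpolated characteristic curve at `rate`.

    `curve` is a list of (R_min, B_min) pairs sorted by strictly increasing rate.
    Returns None when `rate` is below the lowest recorded rate (assumption).
    """
    if rate < curve[0][0]:
        return None
    if rate >= curve[-1][0]:
        return curve[-1][1]  # assumption: hold the last value beyond the curve
    for (r0, b0), (r1, b1) in zip(curve, curve[1:]):
        if r0 <= rate <= r1:
            # Linear interpolation between neighbouring points.
            return b0 + (b1 - b0) * (rate - r0) / (r1 - r0)
    return None


def is_decodable(curve, decoder_rate, decoder_buffer):
    """True if the decoder's (rate, buffer) point lies on or above the curve."""
    needed = required_buffer(curve, decoder_rate)
    return needed is not None and decoder_buffer >= needed
```

With the points (1000, 900), (2000, 500), and (4000, 300), a decoder running at a rate of 1500 needs a buffer of at least 700, so a buffer of 800 passes and a buffer of 600 does not.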
[0087]
The image information decoding apparatus 40 has a configuration basically similar to that of the image information decoding apparatus 40 shown in FIG. 2, executes similar processing, decodes the input bitstream, and outputs the image information to a television receiver (not shown).
[0088]
Whether or not the entire bitstream can be decoded can be determined by examining the characteristic curve of (R_min, B_min, F_min)global together with the decoder buffer size and the decoding bit rate.
[0089]
Further, when it is desired to decode only a specific section from a predetermined random access point by random access or the like, the decoding possibility determination unit 92 similarly creates, from (R_min, B_min, F_min), a characteristic curve as shown in FIG. Linear interpolation is performed between the points. At this time, the bitstream is decodable if the decoder buffer size and the decoding bit rate lie above the characteristic curve created from (R_min, B_min, F_min). Therefore, in such a case, the decoding possibility determination unit 92 determines that decoding is possible and supplies the bitstream to the image information decoding device 40.
[0090]
Next, a description will be given of editing a bitstream. FIG. 8 is a diagram showing a configuration of an embodiment of an editing apparatus 110 that edits a bitstream to which the present invention is applied. As an example of editing performed by the editing apparatus 110, a case where a splice is performed in which a part of the input bitstream 1 is replaced with another input bitstream 2 will be described as an example.
[0091]
Here, the splice will be briefly described. A splice is an edit in which a predetermined bitstream is replaced with another bitstream at a random access point. Such a splice is performed, for example, when a commercial is inserted into a television broadcast program. In this case, the input bitstream 1 is the bitstream of the television broadcast program, and the input bitstream 2 is the bitstream of the commercial.
[0092]
The input bitstream 1 is input to the bitstream analysis unit 111-1, and the input bitstream 2 is input to the bitstream analysis unit 111-2. The bitstream analysis units 111-1 and 111-2 decode the buffer information BH1 and BH2 included in the input bitstreams 1 and 2, respectively, and output them to the bitstream editing unit 112.
[0093]
Based on the buffer information BH1 and BH2, the bitstream editing unit 112 determines whether or not the input bitstream 2 can be inserted into the input bitstream 1 at a predetermined editing point. At this time, in order for the edited bitstream to be decodable without causing the buffer of the decoder (decoding device 90) to fail, it is necessary that the buffer occupancy values at the random access point and immediately before it be the same.
[0094]
Decoders using the MPEG2 and MPEG4 schemes were supposed to operate at specific bit rates and buffer sizes. However, for decoders using the JVT scheme, the constraints on the buffer are relaxed so that decoding is also possible at other bit rates and buffer sizes, as shown in FIG., in accordance with (R_min, B_min, F_min).
[0095]
To prevent editing from changing whether the bitstream can be decoded, the (R_min, B_min, F_min) before and after editing should be the same. Therefore, the bitstream editing unit 112 creates the (R_min, B_min, F_min) characteristics from the random access point headers located in the editing section for the input bitstreams 1 and 2, and when these values match, replaces the section with the bitstream 2. If they do not match, padding bits are inserted into bitstream 1 or 2 so that the (R_min, B_min, F_min) values match, and the section is then replaced with the input bitstream 2.
[0096]
In JVT, restrictions on the buffer are relaxed, and by utilizing this, the buffer compatibility condition in splicing can also be relaxed. In JVT, decoding is possible if the buffer size and decoding bit rate of the decoder lie above the (R_min, B_min, F_min) characteristic. Therefore, if the (R_min, B_min, F_min) characteristic of the predetermined editing section of the input bitstream 2 to be inserted is always below the (R_min, B_min, F_min) characteristic of the predetermined editing section of the original input bitstream 1, a decoder capable of decoding the input bitstream 1 can still decode the result even if the section is replaced with the bitstream 2.
[0097]
FIG. 9 illustrates the relationship. Curve 1 shows the (R_min, B_min, F_min) characteristic in the editing section of the input bitstream 1. Curve 2 shows the (R_min, B_min, F_min) characteristic in the editing section of the input bitstream 2. Since decoding is possible if the decoder's buffer size and decoding bit rate are above this curve, decoding is guaranteed when curve 2 is always below curve 1, as shown in FIG.
[0098]
Therefore, the bitstream editing unit 112 creates the (R_min, B_min, F_min) characteristics from the random access point headers located in the editing section for the bitstreams 1 and 2, and if the characteristic curve of the bitstream 2 lies below the characteristic curve of the bitstream 1, replaces that section with the bitstream 2.
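The comparison of the two characteristic curves can be sketched as follows. The function names `interpolate` and `can_splice` are hypothetical. The sketch assumes each curve is a list of (R_min, B_min) pairs with strictly increasing rates, that values outside a curve's range are held at the nearest end point, and that "below" includes the case where the curves coincide. Because both curves are piecewise linear, it is sufficient to compare them at every breakpoint of either curve.

```python
def interpolate(curve, rate):
    """Piecewise-linear buffer size at `rate`, held constant outside the range."""
    if rate <= curve[0][0]:
        return curve[0][1]
    if rate >= curve[-1][0]:
        return curve[-1][1]
    for (r0, b0), (r1, b1) in zip(curve, curve[1:]):
        if r0 <= rate <= r1:
            return b0 + (b1 - b0) * (rate - r0) / (r1 - r0)
    return curve[-1][1]


def can_splice(curve1, curve2):
    """True if curve 2 is never above curve 1 wherever bitstream 1 is decodable."""
    # Bitstream 2 must be decodable down to bitstream 1's lowest rate (assumption).
    if curve2[0][0] > curve1[0][0]:
        return False
    # Breakpoints of both curves at or above curve 1's lowest rate.
    rates = sorted({r for r, _ in curve1 + curve2 if r >= curve1[0][0]})
    return all(interpolate(curve2, r) <= interpolate(curve1, r) for r in rates)
```

When `can_splice` returns False, the section would need further processing, such as padding, before the replacement is made.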
[0099]
Conversely, if this condition is not satisfied, padding bits are inserted into bitstream 1 or 2 so that the (R_min, B_min, F_min) characteristic curve of the bitstream 2 comes to lie below the characteristic curve of the bitstream 1, and the section is then replaced with the input bitstream 2.
[0100]
When splicing is performed so as to satisfy such a condition, the buffer of a decoder capable of decoding the bitstream 1 does not fail. After performing the splicing, the bitstream editing unit 112 outputs the final bitstream.
[0101]
Thus, by including information such as the minimum bit rate, the minimum buffer size, and the minimum initial delay time (R_min, B_min, F_min) in the header of a randomly accessible point in the bitstream, the decoding side can efficiently determine whether the bitstream can be decoded, and bitstream editing such as splicing can be performed easily while ensuring that decoding is always possible without causing the buffer on the decoding side to fail.
[0102]
FIG. 10 is a diagram illustrating an internal configuration example of a general-purpose personal computer. A CPU (Central Processing Unit) 211 of the personal computer executes various processes according to a program stored in a ROM (Read Only Memory) 212. A RAM (Random Access Memory) 213 appropriately stores data and programs necessary for the CPU 211 to execute various processes. The input / output interface 215 is connected to an input unit 216 including a keyboard and a mouse, and outputs a signal input to the input unit 216 to the CPU 211. The input / output interface 215 is also connected to an output unit 7 including a display and a speaker.
[0103]
Further, a storage unit 218 constituted by a hard disk or the like, and a communication unit 219 for exchanging data with other devices via a network such as the Internet are connected to the input / output interface 215. The drive 220 is used when data is read from or written to a recording medium such as the magnetic disk 231, the optical disk 232, the magneto-optical disk 233, and the semiconductor memory 234.
[0104]
As shown in FIG. 10, the recording medium is constituted not only by package media that are distributed separately from the personal computer to provide the program to the user and on which the program is recorded, such as the magnetic disk 231 (including a flexible disk), the optical disk 232 (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), the magneto-optical disk 233 (including an MD (Mini-Disc) (registered trademark)), or the semiconductor memory 234, but also by the ROM 212 storing the program or a hard disk included in the storage unit 218, which are provided to the user in a state of being pre-installed in the computer.
[0105]
In this specification, the steps describing the program provided by the medium include not only processing performed in time series in the described order, but also processing executed in parallel or individually, not necessarily in time series.
[0106]
Further, in this specification, the system represents the entire apparatus constituted by a plurality of apparatuses.
[0107]
[Effects of the Invention]
As described above, according to the encoding apparatus and method and the first program of the present invention, buffer characteristic information relating to the buffer used when decoding the bitstream is included in the header that is encoded and multiplexed into the bitstream, so it is possible to prevent the buffer from failing on the decoding side.
[0108]
Further, according to the decoding apparatus and method and the second program of the present invention, the buffer characteristic information relating to the buffer at the time of decoding, which is included in the header of the input bitstream, is read, and decoding is performed based on the read information. Therefore, it is possible to prevent the buffer from failing at the time of decoding.
[0109]
Further, according to the editing apparatus and method and the third program of the present invention, whether or not the input bitstream can be edited is determined by checking whether the characteristic curve created from the information included in the header of the first bitstream is always above, or identical to, the characteristic curve created from the information included in the header of the second bitstream. Therefore, it is possible to reduce the processing involved in editing such as splicing, and to determine easily whether editing is possible.
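The curve comparison described above can be sketched as follows. This is an illustration under assumptions: each characteristic curve is represented as a list of (bit rate, buffer size) points with distinct rates, joined by straight lines and held constant outside the given range, and the names `buffer_at` and `can_edit` are hypothetical. The specification may construct the curves differently.

```python
from bisect import bisect_right
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (bit rate, buffer size), sorted by rate


def buffer_at(curve: Curve, rate: float) -> float:
    """Buffer size on the curve at `rate`, by linear interpolation.

    Outside the curve's rate range the end value is held (an assumption).
    """
    rates = [r for r, _ in curve]
    if rate <= rates[0]:
        return curve[0][1]
    if rate >= rates[-1]:
        return curve[-1][1]
    i = bisect_right(rates, rate)
    (r0, b0), (r1, b1) = curve[i - 1], curve[i]
    return b0 + (b1 - b0) * (rate - r0) / (r1 - r0)


def can_edit(first: Curve, second: Curve) -> bool:
    """True when `first` is always above, or identical to, `second`.

    Both curves are piecewise linear, so it is enough to compare them
    at every breakpoint of either curve.
    """
    rates = sorted({r for r, _ in first} | {r for r, _ in second})
    return all(buffer_at(first, r) >= buffer_at(second, r) for r in rates)


if __name__ == "__main__":
    first = [(1_000_000, 800_000), (2_000_000, 400_000)]
    second = [(1_000_000, 600_000), (2_000_000, 300_000)]
    print(can_edit(first, second))
```

Because only header values are compared, the check needs no re-analysis of the coded picture data, which is the source of the reduced editing workload.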
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration of an example of a conventional image information encoding device.
FIG. 2 is a diagram illustrating a configuration of an example of a conventional image information decoding device.
FIG. 3 is a diagram illustrating a buffer amount.
FIG. 4 is a diagram illustrating a relationship between a bit rate and a buffer amount.
FIG. 5 is a diagram showing a configuration of an embodiment of an encoding apparatus to which the present invention is applied.
FIG. 6 is a diagram illustrating a buffer amount.
FIG. 7 is a diagram illustrating the configuration of an embodiment of a decoding device to which the present invention has been applied.
FIG. 8 is a diagram showing a configuration of an embodiment of an editing apparatus to which the present invention is applied.
FIG. 9 is a diagram illustrating a relationship between a bit rate and a buffer amount.
FIG. 10 is a diagram illustrating a medium.
[Explanation of symbols]
70 encoding device, 71 buffer, 72 bitstream analysis unit, 73 buffer information addition unit, 90 decoding device, 91 bitstream analysis unit, 92 decodability determination unit, 111 bitstream analysis unit, 112 bitstream editing unit

Claims

An encoding apparatus comprising:
generating means for generating a header to be referred to as necessary at the time of decoding;
encoding means for encoding each of the header generated by the generating means and an input image signal; and
output means for multiplexing the header and the image signal encoded by the encoding means, and outputting a bitstream,
wherein the generating means generates the header including buffer characteristic information relating to a buffer used when decoding the bitstream.

The encoding apparatus according to claim 1, wherein the generating means generates the header including the buffer characteristic information for each predetermined section that can be randomly accessed in the bitstream.

The encoding apparatus according to claim 1, wherein the generating means generates the header including the buffer characteristic information of the entire sequence of the bitstream.

The encoding apparatus according to claim 1, wherein the buffer characteristic information includes all of a minimum decodable bit rate Rmin, a minimum buffer size Bmin, and a minimum delay amount Fmin when decoding the bitstream.

The encoding apparatus as described above, wherein the buffer characteristic information includes at least one of a minimum decodable bit rate Rmin, a minimum buffer size Bmin, and a minimum delay amount Fmin when decoding the bitstream.

An encoding method comprising:
a generation step of generating a header to be referred to as necessary at the time of decoding;
an encoding step of encoding each of the header generated in the generation step and an input image signal; and
an output step of multiplexing the header and the image signal encoded in the encoding step, and outputting a bitstream,
wherein the generation step generates the header including buffer characteristic information relating to a buffer used when decoding the bitstream.

A recording medium on which a computer-readable program is recorded, the program comprising:
a generation step of generating a header to be referred to as necessary at the time of decoding;
an encoding step of encoding each of the header generated in the generation step and an input image signal; and
an output step of multiplexing the header and the image signal encoded in the encoding step, and outputting a bitstream,
wherein the generation step generates the header including buffer characteristic information relating to a buffer used when decoding the bitstream.

A program for causing a computer to execute processing comprising:
a generation step of generating a header to be referred to as necessary at the time of decoding;
an encoding step of encoding each of the header generated in the generation step and an input image signal; and
an output step of multiplexing the header and the image signal encoded in the encoding step, and outputting a bitstream,
wherein the generation step generates the header including buffer characteristic information relating to a buffer used when decoding the bitstream.

A decoding apparatus comprising:
search means for searching for a header in an input bitstream; and
decoding means for reading buffer characteristic information relating to a buffer included in the header found by the search means, and decoding the bitstream based on the read buffer characteristic information.

The decoding apparatus according to claim 9, wherein the buffer characteristic information is added to the header for each predetermined section that can be randomly accessed in the bitstream.

The decoding apparatus according to claim 9, wherein the buffer characteristic information relating to the entire sequence of the bitstream is added to the header.

The decoding apparatus according to claim 9, wherein the buffer characteristic information includes all of a minimum decodable bit rate Rmin, a minimum buffer size Bmin, and a minimum delay amount Fmin when decoding the bitstream.

The decoding apparatus as described above, wherein the buffer characteristic information includes at least one of a minimum decodable bit rate Rmin, a minimum buffer size Bmin, and a minimum delay amount Fmin when decoding the bitstream.

The decoding apparatus according to claim 9, wherein the decoding means further comprises determination means that creates a buffer characteristic curve from the information read from the bitstream, and determines that the input bitstream can be decoded when the characteristic curve of the decoding apparatus is located above the characteristic curve of the bitstream.

A decoding method comprising:
a search step of searching for a header in an input bitstream; and
a decoding step of reading buffer characteristic information relating to a buffer included in the header found in the search step, and decoding the bitstream based on the read buffer characteristic information.

A recording medium on which a computer-readable program is recorded, the program comprising:
a search step of searching for a header in an input bitstream; and
a decoding step of reading buffer characteristic information relating to a buffer included in the header found in the search step, and decoding the bitstream based on the read buffer characteristic information.

A program for causing a computer to execute processing comprising:
a search step of searching for a header in an input bitstream; and
a decoding step of reading buffer characteristic information relating to a buffer included in the header found in the search step, and decoding the bitstream based on the read buffer characteristic information.

An editing apparatus comprising:
search means for searching for a header in an input bitstream;
determination means for reading buffer characteristic information relating to a buffer included in the header found by the search means, and determining whether or not the bitstream can be edited based on the read information; and
editing means for editing the bitstream if the determination means determines that the bitstream can be edited,
wherein the determination means determines that editing using a first bitstream and a second bitstream is possible when the characteristic curve created from the information included in the header of the first bitstream is always above, or identical to, the characteristic curve created from the information included in the header of the second bitstream.

An editing method comprising:
a search step of searching for a header in an input bitstream;
a determination step of reading buffer characteristic information relating to a buffer included in the header found in the search step, and determining whether or not the bitstream can be edited based on the read information; and
an editing step of editing the bitstream if it is determined in the determination step that the bitstream can be edited,
wherein the determination step determines that editing using a first bitstream and a second bitstream is possible when the characteristic curve created from the information included in the header of the first bitstream is always above, or identical to, the characteristic curve created from the information included in the header of the second bitstream.