JP2014112748A

JP2014112748A - Image coding device and image decoding device

Info

Publication number: JP2014112748A
Application number: JP2011060979A
Authority: JP
Inventors: Sumio Sato; 純生佐藤
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2011-03-18
Filing date: 2011-03-18
Publication date: 2014-06-19
Also published as: WO2012128209A1

Abstract

[Problem] To provide an encoding device and a decoding device that encode and decode a distance image.
[Solution] The encoding device comprises: dividing means that divides an input distance image into rectangular blocks of a predetermined size; copy approximation means that approximates an encoding target block by copying, according to a predetermined copy format, a group of pixels belonging to the already-encoded blocks surrounding the encoding target block; drawing format approximation means that approximates the encoding target block using a predetermined drawing format and accumulates depth value information of the drawing format used; selection means that selects either the copy approximation means or the drawing format approximation means; and code word generation means that transmits a code word generated from the format identification information of the copy format or drawing format selected for the encoding target block and from the accumulated depth value information. The decoding device decodes according to the encoding method used by the encoding device.
[Selected drawing] Figure 1

Description

The present invention relates to an image encoding device and an image decoding device.

Recording the three-dimensional shape of a subject accurately and efficiently is an important topic, and various methods have been proposed. One such method records two kinds of image data in association with each other: a texture image, which is an ordinary two-dimensional image representing the subject space by the colors of each subject and the background, and an image representing the subject space by the distance from the viewpoint to each subject and the background (hereinafter, a "distance image"). A distance image expresses, for each pixel, the distance value (depth value) from the viewpoint to the corresponding point in the subject space. A distance image can be acquired, for example, by a ranging device such as a depth camera installed near the camera that records the texture image. Alternatively, a distance image can be obtained by analyzing multiple texture images captured by a multi-view camera, and many such analysis methods have been proposed.

As a standard for distance images, the Moving Picture Experts Group (MPEG), a working group of the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC), has defined MPEG-C Part 3, which represents distance values in 256 levels (8-bit luminance values); a standard distance image is therefore an 8-bit grayscale image. Because the standard assigns higher luminance values to shorter distances from the viewpoint, nearer subjects appear whiter and farther subjects appear blacker in a standard distance image. Compared with a texture image, a distance image strongly tends to show a single pixel value over a wider area. For example, even if the texture image shows a person wearing boldly patterned clothes, the distance values across the clothing are almost constant in the distance image.

If a texture image and a distance image representing the same subject space are available, the distance from the viewpoint of each pixel of the subject drawn in the texture image is known from the distance image, so the subject can be restored as a three-dimensional shape whose depth is expressed in up to 256 levels. Furthermore, by geometrically projecting the three-dimensional shape onto a two-dimensional plane, the original texture image can be converted into a texture image of the subject space as photographed from a different angle within a certain range of the original angle. In other words, one pair of a texture image and a distance image is enough to restore the three-dimensional shape as seen from any angle within a certain range, so multiple such pairs can represent a free-viewpoint image of a three-dimensional shape with a small amount of data.

Techniques are known that compress and encode video by efficiently removing its temporal or spatial redundancy, as in the video compression standard H.264 (for example, Non-Patent Document 1). When an encoding device using such a technique encodes a texture video (a video whose frames are texture images) and a distance video (a video whose frames are distance images), the redundancy in each video can be removed, further reducing the amount of data transmitted to the decoding device.

The H.264 standard adopts two image transforms, the integer-precision DCT and the Hadamard transform. Both are orthogonal transforms. The integer-precision DCT approximates the real-number-precision DCT (the ordinary DCT) and requires less computation. The Hadamard transform requires even less computation than the integer-precision DCT and is used to transform a block generated by collecting only the DC components (a DC block). These orthogonal transforms are used to calculate the correlation within a block, and the H.264 standard applies them to pixel blocks of at most 16 × 16. That is, the correlation between pixels within a 16 × 16 pixel block is used for information compression. For natural images, these methods compress information very efficiently within an appropriate bit rate range, but at extremely low bit rates the image becomes blurred overall and block noise appears.

To compress information further, the H.264 standard also adopts intra prediction coding. This predicts the pixel values of the encoding target block from already-encoded pixels adjacent to that block. For the encoding target block, information is compressed by orthogonally transforming the difference from the predicted values. The orthogonal transforms described above use only the correlation within a block of at most 16 × 16 pixels, but combining them with intra prediction coding allows compression that also exploits the correlation with adjacent pixels.

"ITU-T Recommendation H.264", International Telecommunication Union - Telecommunication Standardization Sector, March 2009

However, when the compression coding technique defined in the H.264 standard is applied to distance video at extremely low bit rates, blur and block noise appear as described above. As the bit rate of the distance video is lowered, fewer bits are allocated to the coefficients produced by orthogonal transforms such as the integer-precision DCT and the Hadamard transform, quantization distortion increases, and eventually every pixel in a block takes only the value of the DC component.

After decoding, distance video is normally used to generate video from a viewpoint different from the one at which the texture video was captured, and blur and block noise are then major causes of degraded quality in the synthesized video. This is because, in distance video, the position and continuity of subject contours are very important to the quality of the synthesized image. If the contour of a subject is continuous in the texture image but discontinuous in the corresponding distance image, the contour of the subject in the synthesized texture image also becomes discontinuous. In other words, the H.264 standard is a highly efficient method for encoding natural-image video when judged by an objective measure such as PSNR (Peak Signal-to-Noise Ratio), but it cannot be called efficient for a special kind of video, such as distance video, that is used only to synthesize video from arbitrary viewpoints. At the same PSNR, the quality of the synthesized video is generally higher when the subject contours in particular match those of the corresponding texture video.

The present invention has been made in view of these circumstances. Its object is to provide an image encoding device that can reduce the code amount of encoded distance image data compared with conventional techniques, and an image decoding device that decodes a distance image from the encoded data supplied by that image encoding device.

The present invention is an image encoding device that encodes a distance image, comprising: dividing means that divides an input distance image into rectangular blocks of a predetermined size; copy approximation means that approximates an encoding target block obtained by the dividing means by copying, according to a predetermined copy format, a group of pixels belonging to the already-encoded blocks surrounding the encoding target block; drawing format approximation means that approximates the encoding target block obtained by the dividing means using a predetermined drawing format and accumulates depth value information of the drawing format used; selection means that selects either the copy approximation means or the drawing format approximation means; and code word generation means that transmits a code word generated from the format identification information of the copy format or drawing format selected for the encoding target block and from the accumulated depth value information.

The present invention further comprises depth quantization means that quantizes the depth values of the blocks obtained by the dividing means.

In the present invention, the drawing format contains two depth values and defines only the boundary between them.

In the present invention, the selection means selects one copy format, one drawing format, or one combination of one copy format and one drawing format.

In the present invention, one of the two depth values is determined from the group of pixels belonging to the already-encoded blocks surrounding the encoding target block.

In the present invention, the depth value determined from the pixels of the surrounding already-encoded blocks is taken from a pixel position defined in advance for each drawing format.

In the present invention, which of the two regions of a drawing format receives the depth value determined from the pixels of the surrounding already-encoded blocks is defined in advance for each drawing format.

In the present invention, the depth value accumulated for the drawing format is the one that, among all depth values contained in the encoding target block, yields the smallest distortion relative to the input block when used for the approximation.

In the present invention, the combination of one copy format and one drawing format is obtained by creating an approximate block according to the copy format and then overwriting onto it only the region of the drawing format, out of its two regions, opposite to the one whose depth value is taken from the surrounding pixels.

In the present invention, the selection means selects the option that minimizes the distortion relative to the input block over all pixels of the encoding target block.

In the present invention, the selection means weights the distortion relative to the input block more heavily for pixels closer to the edge of the encoding target block, and selects the option that minimizes the weighted distortion over all pixels.

In the present invention, the selection means computes the distortion relative to the input block over all pixels of the encoding target block, weighting only the bottom row and the rightmost column, and selects the option that minimizes the weighted distortion.

In the present invention, the selection means selects one copy format, two copy formats, one drawing format, or one combination of one copy format and one drawing format.

In the present invention, selecting two copy formats includes their order: after copying with the first copy format, the second copy format is applied using, among its source pixels, only those pixels that adjoin a depth value different from their own, and the result overwrites the first copy.

In the present invention, the depth value quantization means is associated with the quantization parameter used when encoding the texture image paired with the distance image.

The present invention is an image decoding device that decodes an encoded distance image produced by the image encoding device according to any one of claims 1 to 15, comprising: analysis means that analyzes the code words of the received encoded distance image; holding means that holds the group of depth values obtained by the analysis means; and decoding means that restores the distance image block by block using a predetermined copy format or a predetermined drawing format, based on the format identification information obtained by the analysis means and on the group of depth values.

The present invention causes a computer to function as the image encoding device according to any one of claims 1 to 15.

The present invention causes a computer to function as the image decoding device according to claim 16.

The present invention is encoded data of a distance image, characterized in that each block of the image is approximated either by copying the already-encoded pixels around the block according to a preset copy format or by using a drawing format prepared in advance; one format is selected from among the copy formats and drawing formats; when a drawing format is selected, the depth value used for it is accumulated; and the data is encoded based on the number of the selected format and the accumulated depth value information.

According to the present invention, it is possible to realize an encoding device that can reduce the code amount of encoded distance image data compared with conventional techniques, and a decoding device that decodes a distance image from the encoded data supplied by that encoding device.

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.
FIGS. 2 to 9 are explanatory diagrams each showing an example of a type of copy format.
FIG. 10 is an explanatory diagram showing one arrow extracted from the group of arrows shown in FIG. 8.
FIG. 11 is an explanatory diagram showing the state after pixels have been copied.
FIG. 12 is an explanatory diagram schematically representing a distance image.
FIG. 13 is an explanatory diagram in which the distance image shown in FIG. 12 is divided into blocks.
FIG. 14 is an explanatory diagram showing an example of drawing formats.
FIG. 15 is an explanatory diagram showing an example of drawing formats.
FIG. 16 is an explanatory diagram showing an example of the code words that the code word generation unit 16 generates for one image.
FIG. 17 is an explanatory diagram showing the structure of the code words shown in FIG. 16 (the code word generation rule).
FIG. 18 is an explanatory diagram showing blocks after encoding.
FIG. 19 is a block diagram showing a modification of the device configuration shown in FIG. 1.

An image encoding device and an image decoding device according to an embodiment of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the embodiment. In the figure, reference numeral 1 denotes an image encoding device that receives a distance image, encodes it, and transmits it over a transmission path. Reference numeral 2 denotes an image decoding device that receives the encoded distance image over the transmission path, decodes it, and outputs a distance image. The image encoding device 1 comprises a dividing unit 11, a processing determination unit 12, a copy format determination unit 13, a drawing format determination unit 14, a depth value accumulation unit 15, and a code word generation unit 16. The image decoding device 2 comprises a code word analysis unit 21, a depth value holding unit 22, a copy format expansion unit 23, and a drawing format expansion unit 24.

First, the processing operation of the image encoding device 1 shown in FIG. 1 is described. When a distance image is input, the dividing unit 11 divides it into a plurality of blocks, for example blocks of 16 × 16 pixels. The dividing unit 11 then outputs the blocks to the processing determination unit 12 as encoding target blocks in raster-scan order, starting from the upper-left block. For each encoding target block, the processing determination unit 12 determines which copy format is optimal, which drawing format is optimal, and whether the two should be used together.
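The block division performed by the dividing unit 11 can be sketched as follows. This is only an illustrative sketch: the function name `split_into_blocks`, the list-of-rows image representation, and the assumption that the image dimensions are multiples of the block size are not taken from the specification.

```python
def split_into_blocks(image, block_size=16):
    """Yield (block_row, block_col, block) in raster-scan order.

    `image` is a list of equal-length rows of depth values. This sketch
    assumes its height and width are multiples of `block_size`.
    """
    height, width = len(image), len(image[0])
    for top in range(0, height, block_size):          # block rows, top to bottom
        for left in range(0, width, block_size):      # blocks in a row, left to right
            block = [row[left:left + block_size]
                     for row in image[top:top + block_size]]
            yield top // block_size, left // block_size, block


# A 32 x 48 test image is split into 2 x 3 blocks, upper-left block first.
image = [[(y * 48 + x) % 256 for x in range(48)] for y in range(32)]
order = [(r, c) for r, c, _ in split_into_blocks(image)]
print(order)
```

Because the blocks come out in raster-scan order, the blocks above and to the left of any block have already been produced when that block is reached, which is what the copy formats rely on.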

Next, the copy formats are described. FIGS. 2 to 9 show examples of types of copy format. In FIGS. 2 to 9, the 16 × 16 pixel block at the lower right is the encoding target block, and the other blocks are already-encoded adjacent blocks. Each grid cell within a block represents a pixel, and each arrowed line indicates where a pixel is copied to. In FIG. 2, for example, the encoding target block is created by copying the pixels in the bottom row of the already-encoded block adjacent above it. Specifically, every pixel in the n-th column from the left of the encoding target block copies the n-th pixel from the left in the bottom row of the block adjacent above. The same applies to the other figures. To explain the arrows further, FIG. 10 shows one arrow extracted from the group of arrows in FIG. 8. In this case, as shown in FIG. 11, the pixels filled in black copy the ninth pixel from the left in the bottom row of the block adjacent above.
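The vertical copy format described for FIG. 2 can be sketched as below, using 4 × 4 blocks for brevity. The function names are hypothetical, and `copy_from_left` is an assumed horizontal counterpart (repeating the rightmost column of the left neighbor); the text does not spell out that variant pixel by pixel.

```python
def copy_from_above(above_block):
    """Vertical copy: every pixel in column n takes the n-th pixel of the
    bottom row of the already-encoded block directly above."""
    bottom_row = above_block[-1]
    return [list(bottom_row) for _ in range(len(above_block))]


def copy_from_left(left_block):
    """Assumed horizontal counterpart: every pixel in row m takes the m-th
    pixel of the rightmost column of the already-encoded block to the left."""
    return [[row[-1]] * len(row) for row in left_block]


above = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 9, 9, 9],
         [10, 20, 30, 40]]
left = [[1, 1, 1, 7],
        [1, 1, 1, 7],
        [1, 1, 1, 3],
        [1, 1, 1, 3]]

vertical = copy_from_above(above)    # four rows of [10, 20, 30, 40]
horizontal = copy_from_left(left)    # rows of 7s, then rows of 3s
print(vertical)
print(horizontal)
```

Note that the neighbor pixels are copied unchanged rather than used as a prediction with a transmitted residual, so a depth edge entering the block keeps its exact position and values.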

Copying adjacent pixels and using them as they are, rather than using them as predicted values as in the H.264 standard, is effective for encoding distance images, for the following reasons. Because a distance image represents the distance to the subject, regions of uniform depth value tend to be fairly large, and except at subject contours the value rarely changes abruptly from pixel to pixel. The probability that adjacent blocks share the same depth value is therefore very high. In addition, when a contour runs continuously from an adjacent block into the encoding target block, copying adjacent pixels in this way preserves the continuity of that contour, and providing formats for a variety of directions makes it possible to handle contours extending in a variety of directions.

FIG. 12 schematically represents a distance image, and FIG. 13 shows FIG. 12 divided into blocks. In FIG. 13, each block represents a block of 16 × 16 pixels. For example, the boundary line extending horizontally across blocks B1 to B6 is the contour of a subject. When each of blocks B2 to B4 is the encoding target block, selecting and applying the copy format shown in FIG. 3 clearly approximates the encoding target block very well.

Furthermore, for cases such as block B6 in FIG. 13, where contours connect from both the upper side and the left side, two copy formats may be selected together with their order. For example, if copying is first performed with the copy format shown in FIG. 3 and then with the copy format shown in FIG. 2, under the rule that the second copy overwrites only the columns other than those already copied with the same depth value, the encoding target block is clearly approximated very well.

Next, the drawing format determination unit 14 is described. FIG. 14 shows examples of drawing formats. Each square represents a block of 16 × 16 pixels, and the lines drawn inside it represent boundaries between depth values. Drawing format P1 is a block consisting of a single depth value. Drawing format P2 divides the block with a vertical line at a horizontal ratio of 1:3. The example in FIG. 14 is a model that assumes a block contains two depth values. Consequently, when a block contains three or more different depth values they are reduced to two, but because the number of formats is limited, compression efficiency can be improved. The same applies to the other formats. The formats are not limited to those shown in FIG. 14; for example, drawing formats such as those shown in FIG. 15 may also be used.

A drawing format defines only the boundary between depth values; for the depth values themselves, each drawing format specifies in advance which already-encoded adjacent pixels to use. For example, for drawing format P2 shown in FIG. 14, the value of the topmost pixel in the pixel column adjacent on the left is used as the depth value of the left part, and the value of the rightmost pixel in the pixel row adjacent above is used as the depth value of the right part. Then, for each drawing format, the distortion relative to the input image (the sum of squared differences of the depth values) is calculated.
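As an illustration of drawing format P2 and the distortion measure, the following sketch renders a block split 1:3 by a vertical boundary, takes the two depth values from the neighbor pixels named in the text, and computes the sum of squared differences. The function names and the 8 × 8 block size are assumptions made for the example.

```python
def render_p2(size, left_depth, right_depth):
    """Render a size x size block split by a vertical line at a 1:3 ratio."""
    boundary = size // 4
    return [[left_depth] * boundary + [right_depth] * (size - boundary)
            for _ in range(size)]


def ssd(block_a, block_b):
    """Sum of squared differences between two blocks of equal shape."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


size = 8
left_column = [50] * size                # pixel column adjacent on the left
top_row = [50, 50] + [200] * (size - 2)  # pixel row adjacent above

left_depth = left_column[0]              # topmost pixel of the left column
right_depth = top_row[-1]                # rightmost pixel of the upper row
approximation = render_p2(size, left_depth, right_depth)

# Input block that matches the format except for one pixel that is off by 10.
input_block = [[50, 50] + [200] * (size - 2) for _ in range(size)]
input_block[3][5] = 190
distortion = ssd(approximation, input_block)
print(distortion)  # 100
```

Since both depth values come from already-encoded neighbors, only the format identifier needs to be signalled for such a block.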

However, this alone cannot approximate the block accurately when the encoding target block contains a depth value that is not found in the already-encoded adjacent blocks. Therefore, if the encoding target block contains other depth values that were not used in the preceding calculation, the distortion is calculated in the same way using each of them. When a depth value not contained in the adjacent blocks is used in this way, that depth value is output to the depth value accumulation unit 15. In this manner, the drawing format determination unit 14 selects, for the encoding target block, the format that best approximates the boundary between depth values.
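One way to picture this fallback is the sketch below, which picks a depth value for a single region of a drawing format. The helper name `best_depth_for_region`, the tie-breaking in favor of the neighbor-derived value, and the list standing in for the depth value accumulation unit 15 are all assumptions; the text does not specify the search procedure at this level of detail.

```python
def best_depth_for_region(region_pixels, neighbor_depth):
    """Return (depth, must_accumulate) for one region of a drawing format.

    The neighbor-derived depth is tried first, then every other depth value
    that occurs in the region; the candidate with the smallest sum of squared
    differences wins. Ties go to the neighbor-derived depth, which does not
    need to be stored.
    """
    candidates = [neighbor_depth] + sorted(set(region_pixels) - {neighbor_depth})

    def cost(depth):
        return sum((p - depth) ** 2 for p in region_pixels)

    best = min(candidates, key=cost)  # min() keeps the first of equal costs
    return best, best != neighbor_depth


accumulated_depths = []  # stands in for the depth value accumulation unit 15

# The region holds a new object at depth ~120, but the neighbor pixel is 60.
depth, must_accumulate = best_depth_for_region([120] * 10 + [118] * 2, 60)
if must_accumulate:
    accumulated_depths.append(depth)
print(depth, must_accumulate, accumulated_depths)
```

Only depth values that cannot be recovered from the neighbors end up in the accumulated list, which keeps the amount of explicitly coded depth information small.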

The process determination unit 12 determines the optimum way of using the drawing format determination unit 14 and the copy format determination unit 13 described above. Specifically, it selects one of the following for approximating the encoding target block: (1) using only the copy format determination unit, approximating with a copy format based on adjacent pixel groups; (2) using only the drawing format determination unit, approximating with a drawing format; or (3) approximating using both a copy format and a drawing format.

Here, method (3), approximation using both a drawing format and a copy format, will be described. In this case, each copy format shown in FIGS. 2 to 9 is combined exhaustively with each drawing format shown in FIG. 14. For each combination, the encoding target block is created by overwriting only the one of the two regions of the drawing format whose depth value is not taken from the surrounding pixel group. For each combination, the distortion of all pixels in the encoding target block with respect to the input image is calculated. The combination with the least distortion is the optimum combination.
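The exhaustive search over combinations can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the copy-format approximations are passed in as precomputed blocks, each drawing format is reduced to a boolean mask marking the region to overwrite, and all names are hypothetical.

```python
def combine_copy_and_drawing(copy_approx, overwrite_mask, new_depth):
    """Overwrite the masked pixels of a copy-format approximation with new_depth."""
    return [
        [new_depth if masked else value for value, masked in zip(row, mask_row)]
        for row, mask_row in zip(copy_approx, overwrite_mask)
    ]


def best_combination(input_block, copy_candidates, drawing_masks, new_depth):
    """Try every (copy format, drawing format) pair and keep the least distortion."""
    best = None
    for copy_id, copy_approx in copy_candidates.items():
        for draw_id, mask in drawing_masks.items():
            candidate = combine_copy_and_drawing(copy_approx, mask, new_depth)
            distortion = sum(
                (a - b) ** 2
                for row_a, row_b in zip(input_block, candidate)
                for a, b in zip(row_a, row_b)
            )
            if best is None or distortion < best[0]:
                best = (distortion, copy_id, draw_id)
    return best


# Tiny 2 x 2 example with made-up candidates.
input_block = [[60, 90], [60, 90]]
copy_candidates = {0: [[60, 60], [60, 60]], 1: [[70, 70], [70, 70]]}
drawing_masks = {
    0: [[False, True], [False, True]],  # overwrite the right half
    1: [[True, True], [True, True]],    # overwrite the whole block
}
result = best_combination(input_block, copy_candidates, drawing_masks, 90)
```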

Next, how to select among approximation methods (1), (2), and (3) will be described. One conceivable method is to take the distortion over all pixels of the encoding target block as a common criterion, calculate that distortion for each of methods (1), (2), and (3), and select the one with the least distortion. However, in the method of the present invention, an encoded block propagates to the blocks that follow it. It is therefore important to reduce the distortion at the block boundaries, in particular for the pixels in the rightmost column and the bottom row, which border the blocks to be encoded afterwards. Accordingly, when determining which copy format is optimal, the distortion (sum of squared differences) with respect to the input distance image is weighted more heavily for the pixel group in the rightmost column and the pixel group in the bottom row of the encoding target block than for the other parts; the weighted distortion is calculated for each format, and the format with the smallest value is taken as the optimum. Alternatively, a function that gives greater weight the closer a pixel is to the block boundary may be used. This reduces the misalignment of contours at the boundaries with the adjacent blocks on the right of and below the encoding target block, and is thus effective for preserving contour continuity.
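The boundary-weighted distortion can be sketched as below. The weight value of 4 and the function name are assumptions chosen for illustration; the text says only that the rightmost column and bottom row are weighted more heavily than the rest of the block.

```python
def weighted_distortion(block, approx, edge_weight=4):
    """Sum of squared differences with extra weight on the rightmost column
    and the bottom row, which later blocks will copy from."""
    size = len(block)
    total = 0
    for y in range(size):
        for x in range(size):
            on_edge = (x == size - 1) or (y == size - 1)
            weight = edge_weight if on_edge else 1
            total += weight * (block[y][x] - approx[y][x]) ** 2
    return total


# 4 x 4 example where every pixel is off by 1:
# 9 interior pixels at weight 1 plus 7 edge pixels at weight 4.
block = [[10] * 4 for _ in range(4)]
approx = [[11] * 4 for _ in range(4)]
example = weighted_distortion(block, approx)
```

Setting `edge_weight` to 1 reduces this to the plain all-pixel distortion used as the common criterion.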

For a block located in the top row of the image, no upper adjacent block exists, so the calculation is performed for a total of five formats: the two formats of FIGS. 3 and 9 and the three formats of FIGS. 5, 6, and 7. The formats of FIGS. 5 to 7 also refer to the pixel group of the upper adjacent block; when the encoding target block is in the top row of the image, the uppermost pixel of the pixel group of the left adjacent block is copied in place of the pixel group of the upper adjacent block. Similarly, for a block located in the leftmost column of the image, no left adjacent block exists, so the calculation is performed for a total of six formats: the three formats of FIGS. 2, 4, and 8 plus the three formats of FIGS. 5 to 7. As before, the leftmost pixel of the upper adjacent pixel group is used for pixels that cannot be referenced. In addition, for a block located in the rightmost column of the image, the rightmost pixel of the upper adjacent pixel group is used for the formats of FIGS. 4 and 8. Because the block at the top-left corner has no encoded adjacent block, no copy format is used for it.

Next, the process determination unit 12 outputs the following to the codeword generation unit 16: information indicating which of approximation methods (1), (2), and (3) was used for encoding; when a drawing format was used, information indicating the selected drawing format; when a depth value was output to the depth value accumulation unit 15, information indicating that it was output; and when a copy format was used, information indicating the selected copy format.

The depth value accumulation unit 15 holds the input depth values until all blocks in one image have been encoded by the process determination unit 12. When encoding of the image is complete, it outputs the accumulated group of depth values to the codeword generation unit 16. The codeword generation unit 16 assigns codewords consisting of the binary values "0" and "1" to the input depth value information. FIG. 16 shows an example of the codewords that the codeword generation unit 16 generates for one image. In FIG. 16, X1 to X5 each denote a binary codeword. Here, each of X1 to X5 is assumed to have a fixed length, and the number of bits of each is assumed to be transmitted to the decoding side in advance or, for example, before encoding, so that it is known on the decoding side. FIG. 17 shows the structure of the codewords shown in FIG. 16. X1 represents the number of depth values transmitted for the encoding target image. For example, when an image of 1024 × 768 pixels is divided into blocks of 16 × 16 pixels, the total number of blocks is 3072, so at most 3072 depth values are transmitted, and this count can be represented with 12 bits.
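The arithmetic behind the 12-bit figure can be checked with a short calculation. The helper name is hypothetical, and the sketch assumes the image dimensions are exact multiples of the block size, as in the example.

```python
def count_field_bits(width, height, block=16):
    """Return the number of blocks and the bits needed to store a count
    between 0 and that number (the X1 field)."""
    blocks = (width // block) * (height // block)
    return blocks, blocks.bit_length()


blocks, bits = count_field_bits(1024, 768)
```

Here `blocks` is 64 × 48 = 3072, and because 2^11 = 2048 < 3072 ≤ 4095 = 2^12 − 1, twelve bits suffice.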

X2 is the sequence of depth values, listed in order, whose count is given by X1. For example, when depth values are expressed in the range 0 to 255, each depth value can be represented with 8 bits. Next, a codeword consisting of X3 and X4 is repeated once for each block in the encoding target image. X3 indicates which of approximation methods (1), (2), and (3) was used: it is "0" when only a copy format is used (method (1)), "10" when only a drawing format is used (method (2)), and "11" when both a copy format and a drawing format are used (method (3)).
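A minimal sketch of how the X1 and X2 fields might be serialized is shown below. The function name and the use of a string of "0"/"1" characters are assumptions for illustration; the 12-bit and 8-bit widths follow the examples in the text.

```python
def pack_depth_header(depths, count_bits=12, depth_bits=8):
    """Serialize X1 (number of depth values) followed by X2 (the values)."""
    x1 = format(len(depths), f"0{count_bits}b")
    x2 = "".join(format(d, f"0{depth_bits}b") for d in depths)
    return x1 + x2


# Two accumulated depth values, as in the worked example (60 and 90).
header = pack_depth_header([60, 90])
```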

X4 is identification information that identifies a copy format or a drawing format. Here, there are eight copy formats (FIGS. 2 to 9) and thirteen drawing formats (FIG. 14). When approximation method (1) or (3) is selected in X3, X4 carries the identification information of the copy format; when approximation method (2) is selected in X3, X4 carries the identification information of the drawing format. The codeword length is 3 bits for a copy format and 4 bits for a drawing format. X5 is present only when approximation method (3) is selected in X3 and carries the identification information of the drawing format, again with a codeword length of 4 bits.
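The per-block codeword layout can be sketched as follows. The function name is hypothetical, and the sketch assumes zero-based format indices (so the first copy format is "000" and drawing format P1 is "0000"), which matches the bit patterns quoted in the worked example.

```python
def block_codeword(method, copy_index=None, draw_index=None):
    """Build the X3/X4(/X5) bits for one block.

    method 1: copy format only     -> "0"  + 3-bit copy index
    method 2: drawing format only  -> "10" + 4-bit drawing index
    method 3: both                 -> "11" + 3-bit copy index + 4-bit drawing index
    """
    if method == 1:
        return "0" + format(copy_index, "03b")
    if method == 2:
        return "10" + format(draw_index, "04b")
    if method == 3:
        return "11" + format(copy_index, "03b") + format(draw_index, "04b")
    raise ValueError("method must be 1, 2, or 3")


first_block = block_codeword(2, draw_index=0)   # drawing format P1
second_block = block_codeword(1, copy_index=1)  # second copy format
```

Because "0", "10", and "11" form a prefix code, a decoder can tell from X3 alone how many bits follow for the block.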

The operation by which the process determination unit 12 encodes the input distance image block by block through the series of processing operations described above will now be explained with reference to FIG. 13. First, when block B7 is input from the dividing unit 11 to the process determination unit 12, no pixels of encoded adjacent blocks exist to copy from, so the process determination unit 12 outputs block B7 to the drawing format determination unit 14 and obtains the optimum drawing format P1 (see FIG. 14). Since no depth value is available for reference at this point, the process determination unit 12 causes the drawing format determination unit 14 to output the single depth value that makes up this block (for example, the value 60) to the depth value accumulation unit 15, which stores the value 60 internally. The process determination unit 12 also outputs, to the codeword generation unit 16, identification information indicating that drawing format P1 was selected and information indicating that a value was stored in the depth value accumulation unit 15. When the codeword generation unit 16 generates the codeword according to the codeword generation rule shown in FIG. 17, X3 is "10" and X4 is "0000".

Next, the block immediately to the right is input to the process determination unit 12. Given the codeword assignment shown in FIG. 17, using a copy format keeps the number of assigned bits small, so the copy format shown in FIG. 3 is selected. Here, the identification information of the copy formats shown in FIGS. 2 to 9 is assumed to be assigned the numbers 1 to 8, encoded as "000" to "111", respectively. In this case, as shown in FIG. 17, X3 is 0 and X4 is 001. The next block to the right is processed in the same way, so X3 is again 0 and X4 is 001. This is repeated up to the block just before block B1 (see FIG. 13). However, for the leftmost block of each of the second and third rows, the copy format is the one shown in FIG. 2, so codeword X4 is 000. For the blocks in the second and third rows other than the leftmost block, the copy formats of FIG. 2 and FIG. 3 give equal distortion, so either may be selected.

Next, block B1 is input to the process determination unit 12. The process determination unit 12 has the copy format determination unit 13 calculate the distortion for each format. This distortion may be the distortion over all pixels in the block or, as described above, a weighted distortion. For this block, the distortion is the same for every copy format. The process determination unit 12 then has the drawing format determination unit 14 calculate the distortion for each format. Here, drawing format P13 shown in FIG. 14 gives the least distortion. The codeword has X3 equal to 10 and X4 equal to 1100 (the identification number of drawing format P13 is taken to be 13, which, with P1 encoded as 0000, corresponds to 1100). The depth value contained in the lower-right corner of block B1 (for example, the value 90) is then output to the depth value accumulation unit 15. The block after encoding is as shown by block B11 in FIG. 18.

Next, blocks B2 to B4 are input in order to the process determination unit 12 and processed in the same way as described above. For each of blocks B2 to B4, the copy format determination unit 13 likewise selects the copy format shown in FIG. 7, and the codeword has X3 equal to 0 and X4 equal to 001. The blocks after encoding are as shown by blocks B21 to B41 in FIG. 18.

Next, for block B5, the copy format determination unit 13 selects the format shown in FIG. 7. The codeword has X3 equal to 0 and X4 equal to 101. The block after encoding is as shown by block B51 in FIG. 18. For an input block such as block B6, two copy formats are used together, as described above. The codeword generation rule for X4 shown in FIG. 17 does not define codewords for this case, but codewords for each combination of two copy formats may be defined, for example, from 1001 onward. For this block, combining a copy format (FIG. 7) with a drawing format (drawing format P3 in FIG. 14) gives about the same distortion. If the distortion is the same, either may be selected.

Encoding proceeds in this way. When processing of one image is complete, the depth values accumulated in the depth value accumulation unit 15 and their total count are output to the codeword generation unit 16, X1 and X2 are generated, and codewords X1 to X5 are transmitted over the transmission path as the encoded distance image.

Next, the processing operation of the image decoding device 2 shown in FIG. 1 will be described. The codeword analysis unit 21 receives the encoded distance image transmitted over the transmission path, splits it into codewords X1 to X5, outputs X1 and X2 to the depth value holding unit 22, and outputs X3 to X5 to the copy format development unit 23. The depth value holding unit 22 outputs depth values, in order, to the drawing format development unit 24 as needed. The copy format development unit 23 renders blocks encoded with a copy format and outputs the result to the drawing format development unit 24. The drawing format development unit 24 renders blocks encoded with a drawing format. Through this processing, the input distance image encoded on the encoding side is decoded.

The description follows the specific example given above with reference to FIG. 13. First, X1 and X2 are output to the depth value holding unit 22. The first entry of X2 is the depth value 60 from encoding block B7 with a drawing format. Likewise, the second entry is the depth value 90 from encoding block B1 with a drawing format.

Next, the codeword 100000 for block B7 is input to the drawing format development unit 24. The drawing format development unit 24 parses it as X3 = 10 and X4 = 0000, selects drawing format P1, and obtains the first depth value, 60, from the depth value holding unit 22. It then draws the first block in drawing format P1 using the depth value 60, and outputs X3 and X4 to the copy format development unit 23. Because X3 is 10, the copy format development unit 23 performs no processing, and decoding of this block ends.
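The parsing step on the decoding side can be sketched as follows. This is a simplified illustration that assumes a well-formed bit string containing only the per-block codewords, with zero-based indices; the function name and the tuple layout are hypothetical.

```python
def parse_blocks(bits):
    """Split a string of per-block codewords into (kind, copy_index, draw_index)."""
    blocks = []
    pos = 0
    while pos < len(bits):
        if bits[pos] == "0":
            # X3 = "0": copy format only, 3-bit X4
            blocks.append(("copy", int(bits[pos + 1:pos + 4], 2), None))
            pos += 4
        elif bits[pos:pos + 2] == "10":
            # X3 = "10": drawing format only, 4-bit X4
            blocks.append(("draw", None, int(bits[pos + 2:pos + 6], 2)))
            pos += 6
        else:
            # X3 = "11": 3-bit copy index in X4, 4-bit drawing index in X5
            copy_index = int(bits[pos + 2:pos + 5], 2)
            draw_index = int(bits[pos + 5:pos + 9], 2)
            blocks.append(("both", copy_index, draw_index))
            pos += 9
    return blocks


# Block B7 (drawing format, index 0) followed by two copy-format blocks.
parsed = parse_blocks("100000" + "0001" + "0001")
```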

Next, the codeword 0001 for the block to the right of block B7 is input to the drawing format development unit 24. The drawing format development unit 24 parses it as X3 = 0 and X4 = 001, performs no processing, and outputs X3 and X4 to the copy format development unit 23. The copy format development unit 23 uses the copy format of FIG. 3, the format selected for this block during encoding, and copies the pixel group in the rightmost column of block B7 in the horizontal direction. By performing this processing on each block in turn, the encoded distance image is decoded and the distance image is restored.
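The horizontal copy described above can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the copy format simply repeats each pixel of the left neighbour's rightmost column across the corresponding row of the new block.

```python
def copy_horizontal(left_block):
    """Build a block by repeating the rightmost column of the left-adjacent
    block across each row."""
    size = len(left_block)
    return [[row[-1]] * size for row in left_block]


# If block B7 was decoded as a flat block of depth 60, its right neighbour
# becomes a flat block of depth 60 as well.
b7 = [[60] * 16 for _ in range(16)]
right_neighbour = copy_horizontal(b7)
```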

Next, a modification of the image encoding device 1 and the image decoding device 2 shown in FIG. 1 will be described with reference to FIG. 19, which is a block diagram of the modified configuration. The image encoding device 1 shown in FIG. 19 differs from that of FIG. 1 in that a depth quantization unit 17 and an entropy encoding unit 18 are added. The image decoding device 2 shown in FIG. 19 differs from that of FIG. 1 in that an entropy decoding unit 25 is added.

The depth quantization unit 17 quantizes the depth values of the blocks output from the dividing unit 11. The quantization step may be specified in advance, or it may be tied to the value of the quantization parameter qP used in, for example, the H.264 standard. Alternatively, it may be tied to the value of qP used when encoding the texture image corresponding to this distance image. To tie it to qP, for example, the quantization step at the maximum qP value of 51 is fixed (for example, 16 = 2^4), and the quantization step for each qP is derived from it. In this case the quantization step s can be expressed as
s = 2^{4 + floor((51 − qP) / 6)}
where floor(x) is the function giving the largest integer not exceeding x. With qP = 51 the exponent is 4, giving s = 16 as in the example above.
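The formula can be evaluated with a short function. This sketch takes the expression as printed, with the difference 51 − qP grouped before the division by 6; that grouping is an assumption, chosen because it is the reading that reproduces the stated step of 16 = 2^4 at qP = 51. The function name is hypothetical.

```python
def quant_step(qp):
    """Quantization step s = 2^(4 + floor((51 - qP) / 6)) for 0 <= qP <= 51."""
    if not 0 <= qp <= 51:
        raise ValueError("qP is expected to lie in 0..51")
    return 2 ** (4 + (51 - qp) // 6)


step_at_max_qp = quant_step(51)
```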

Quantizing the depth values appropriately beforehand in this way simplifies the distance image and improves the accuracy of the subsequent encoding with drawing formats and copy formats. In addition, quantizing in advance allows the number of bits assigned to X2 shown in FIG. 17 to be limited to a number of bits sufficient for the quantization step s, which compresses the information further.

The entropy encoding unit 18 further compresses the information by entropy-encoding the encoded distance image generated by the codeword generation unit 16. The entropy decoding unit 25 decodes the entropy-encoded distance image. Applicable schemes include arithmetic coding and dictionary coding, as well as adaptive arithmetic coding and adaptive dictionary coding, which adaptively update their occurrence probability tables and codebooks, respectively. The present scheme compresses a single distance image; to remove redundancy among multiple images corresponding to different times, the H.264 standard can also be applied. That is, the scheme of the present invention may be used only for I frames of the H.264 standard, with the H.264 standard itself used for B frames and P frames.

Although the block size has been described as 16 × 16 pixels, it is not limited to this size and may be 8 × 8 pixels or 4 × 4 pixels. Furthermore, as with macroblocks in the H.264 standard, the size may be varied block by block. In those cases, the copy formats can be used unchanged, and the drawing formats can be used simply scaled down. The same scheme can also be applied to rectangular blocks such as 16 × 8 pixels or 8 × 16 pixels. In this case, the copy formats use only the pixels that the rectangle covers, and the drawing formats are scaled linearly from the square to the rectangle. Which of these block sizes and shapes to use may be decided, for example, by selecting, for each 16 × 16 pixel block, the combination of block shape, size, and copy format or drawing format that minimizes the distortion. In this case, the arrow from the dividing unit 11 to the process determination unit 12 in FIG. 1 represents multiple data flows, in which data divided into the various shapes is output from the dividing unit 11 to the process determination unit 12.

As described above, it is possible to realize an encoding device that reduces the code amount of the encoded data of a distance image compared with the prior art, and a decoding device that decodes the distance image from the encoded data supplied by this encoding device.

A program for realizing the functions of the processing units in FIGS. 1 and 19 may be recorded on a computer-readable recording medium, and the image encoding and image decoding processes may be performed by having a computer system read and execute the program recorded on that medium. The "computer system" here includes an OS and hardware such as peripheral devices. The "computer system" also includes a WWW system equipped with a homepage providing environment (or display environment). The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" further includes media that hold a program for a fixed period of time, such as the volatile memory (RAM) inside a computer system that acts as a server or client when the program is transmitted over a network such as the Internet or a communication line such as a telephone line.

The program may also be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium or by transmission waves in a transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having the function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line. The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

The present invention can be applied to uses in which encoding and decoding of distance images are essential.

DESCRIPTION OF SYMBOLS: 1 ... image encoding device, 11 ... dividing unit, 12 ... process determination unit, 13 ... copy format determination unit, 14 ... drawing format determination unit, 15 ... depth value accumulation unit, 16 ... codeword generation unit, 17 ... depth quantization unit, 18 ... entropy encoding unit, 2 ... image decoding device, 21 ... codeword analysis unit, 22 ... depth value holding unit, 23 ... copy format development unit, 24 ... drawing format development unit, 25 ... entropy decoding unit

Claims

An image encoding device for encoding a distance image, comprising:
dividing means for dividing an input distance image into rectangular blocks of a predetermined size;
copy approximating means for approximating an encoding target block divided by the dividing means by copying, based on a predetermined copy format, a group of pixels constituting encoded blocks around the encoding target block;
drawing format approximating means for approximating the encoding target block divided by the dividing means by using a predetermined drawing format, and for accumulating depth value information for the drawing format used;
selecting means for selecting either the copy approximating means or the drawing format approximating means; and
codeword generation means for transmitting codewords generated based on format identification information of the copy format or drawing format selected for the encoding target block and on the accumulated depth value information.

The image coding apparatus according to claim 1, further comprising a depth quantization unit that quantizes a depth value of the block divided by the division unit.

The image encoding apparatus according to claim 1, wherein the drawing format includes two depth values and defines only a boundary between the depth values.

The image encoding device according to any one of the preceding claims, wherein the selection means selects one of the copy formats, one of the drawing formats, or one combination of one copy format and one drawing format.

The image coding apparatus according to claim 3, wherein one of the two depth values is determined from a pixel group constituting a coded block around a coding target block.

The image encoding apparatus according to claim 5, wherein the depth value determined from the pixel group constituting an encoded block around the encoding target block is taken from a pixel position defined in advance for each drawing format.

The image coding apparatus according to claim 5, wherein which of the two regions included in the drawing format receives the depth value determined from the pixel group constituting an encoded block around the encoding target block is determined in advance for each drawing format.

The depth value accumulated using the drawing format is set to a depth value that minimizes distortion with the input block when approximated using all depth values included in the encoding target block. The image encoding device according to claim 5.

The method of combining one copy format and one drawing format creates an approximate block based on each copy format, and is the opposite of the region adopted from the surrounding pixel group among the two regions included in each drawing format. The image coding apparatus according to claim 4, wherein only the area is obtained by overwriting the approximate block.

The image encoding device according to any one of claims 1 to 9, wherein the selection unit selects a pixel that minimizes distortion with respect to an input block with respect to all pixels of an encoding target block.

The selection means weights the distortion of the input block with respect to all the pixels of the encoding target block as it approaches the end of the block, and selects the one that minimizes the weighted distortion. The image encoding device according to any one of 1 to 9.

The selection means weights only the bottom row and right end column of the block with respect to the distortion of the input block with respect to all the pixels of the encoding target block, and selects the weighted distortion that is minimized. An image encoding device according to any one of claims 1 to 9.

The image encoding device according to claim 1, wherein the selection means selects one of the copy formats, two of the copy formats, one of the drawing formats, or one combination of one copy format and one drawing format.

The image encoding device according to claim 13, wherein the selection of two copy formats specifies the order of copying: copying is first performed in the first copy format, and then, among the pixel groups used for the second copy format, only those adjacent to a depth value different from the depth value they hold are used for the second copy format and overwritten.

The image encoding device according to claim 2, wherein the depth value quantization means is linked to the quantization parameter used when encoding the texture image paired with the distance image.

An image decoding device that decodes an encoded distance image encoded by the image encoding device according to any one of claims 1 to 15, comprising:
analysis means for analyzing the codewords of the received encoded distance image;
holding means for holding a depth value group obtained through analysis by the analysis means; and
decoding means for restoring the distance image block by block using a predetermined copy format or a predetermined drawing format, based on the identification information obtained by the analysis means and the depth value group.
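The block-by-block restoration performed by the decoding means can be sketched as a dispatch on the format identifier. The three formats below (extend the left neighbour column, extend the top neighbour row, and a left/right two-region drawing) and their numbering are hypothetical stand-ins chosen for illustration. The claims do not define the actual formats or the codeword layout.

```python
def decode_block(format_id, depths, left_col, top_row, size):
    """Restore one size x size block from a format identifier and depth values.

    format_id: identifier recovered from the codeword (numbering is hypothetical).
    depths:    depth values held for this block; used only by drawing formats.
    left_col:  decoded pixels of the column immediately left of the block.
    top_row:   decoded pixels of the row immediately above the block.
    """
    if format_id == 0:
        # Hypothetical copy format: extend each left-neighbour pixel across its row.
        return [[left_col[y]] * size for y in range(size)]
    if format_id == 1:
        # Hypothetical copy format: extend each top-neighbour pixel down its column.
        return [list(top_row) for _ in range(size)]
    if format_id == 2:
        # Hypothetical drawing format: left half and right half, one depth each.
        first, second = depths
        half = size // 2
        return [[first] * half + [second] * (size - half) for _ in range(size)]
    raise ValueError(f"unknown format identifier: {format_id}")
```

Copy formats need no stored depth values because they reuse already decoded neighbours, whereas the drawing format consumes values from the held depth value group.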

An image encoding program for causing a computer to function as the image encoding device according to any one of claims 1 to 15.

An image decoding program for causing a computer to function as the image decoding device according to claim 16.

Encoded data of a distance image, characterized in that, for each block of the image, the block is approximated either by copying the encoded pixel group around the block according to a predetermined copy format or by using a drawing format prepared in advance; one format is selected from among the copy formats and the drawing formats; when a drawing format is selected, the depth value used for it is stored; and the data is encoded based on the number of the selected format and the stored depth value information.