JP2006295408A - Image encoding apparatus and image encoding program

Info

Publication number: JP2006295408A
Application number: JP2005111310A
Authority: JP
Inventors: Kazuya Takagi; 一也高木; Yoshimasa Honda; 義雅本田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-04-07
Filing date: 2005-04-07
Publication date: 2006-10-26

Abstract

PROBLEM TO BE SOLVED: To provide an image encoding apparatus that avoids selecting prediction modes liable to cause linear noise and thereby suppresses linear noise in the decoded image.
SOLUTION: An image encoding apparatus divides an input image into blocks and encodes the difference between each block and a predicted image generated according to a prediction mode selected through intra prediction. The apparatus comprises: an adjacent prediction mode reference unit 401 that refers to the prediction mode of a block adjacent to the encoding target block; a cost function switching unit 402 that outputs a command to switch to a predetermined cost function according to the referenced adjacent prediction mode; and an intra prediction unit 101 that switches to the predetermined cost function in accordance with the command output from the cost function switching unit 402 and generates a predicted image using a prediction mode selected on the basis of the result of calculation with that cost function.
[Selected drawing] FIG. 1

Description

The present invention relates to an image encoding apparatus and an image encoding program that implement H.264/AVC intra coding.

H.264/AVC is a video coding standard established by ITU-T and ISO/IEC, and achieves up to about twice the compression efficiency of earlier coding standards such as MPEG-4 and H.263. Although H.264/AVC is a video coding scheme, Non-Patent Documents 1 to 3 and others report that it also performs well as a still-image coding scheme, that is, as intra coding. An image generated by intra coding, called an intra frame, is compressed using that single image alone, and is therefore advantageous for random access in image retrieval and for image editing. An outline of H.264/AVC intra coding is given below.
Non-Patent Document 1: Kotaro Asai, "Intra-only video coding", MPEG2003/M10075
Non-Patent Document 2: Kotaro Asai, "Intra-only video coding", MPEG2003/M10246
Non-Patent Document 3: Kotaro Asai, "Intra-only video coding", MPEG2003/M10320

FIG. 3 is a block diagram of an encoding apparatus that implements H.264/AVC intra coding. In the figure, the intra prediction unit 101 receives the decoded image stored in the frame memory 109 and performs intra prediction. Exploiting the high correlation between adjacent pixels, it uses the input decoded image to select one mode from the plurality of prediction modes defined in H.264/AVC and generates an image called a predicted image.

The prediction modes defined in H.264/AVC comprise 13 modes: 9 modes for 4×4 blocks and 4 modes for 16×16 blocks. The optimal prediction mode is selected from among them. FIG. 4 and FIG. 5 are conceptual diagrams showing the prediction directions for 4×4 blocks and 16×16 blocks, respectively. As shown in FIG. 4, a 4×4 block has 9 modes in total: 8 directional prediction modes and 1 non-directional prediction mode. As shown in FIG. 5, a 16×16 block has 4 modes in total: 3 directional prediction modes and 1 non-directional prediction mode. The number attached to each prediction direction in the figures is the mode number.

FIG. 6 is a flowchart of the processing performed by the intra prediction unit 101. First, in step 21, the intra prediction unit 101 partitions the macroblock into 4×4 blocks and performs intra prediction. As described above, 4×4 block intra prediction selects from the nine modes defined for 4×4 blocks. How the 4×4 block prediction mode is determined is described later.

Next, in step 22, the intra prediction unit 101 handles the macroblock in units of 16×16 blocks and performs intra prediction. As described above, 16×16 block intra prediction selects from the four modes defined for 16×16 blocks. How the 16×16 block prediction mode is determined is also described later.

Finally, in step 23, the intra prediction unit 101 decides which of the modes determined by the 4×4 block and 16×16 block intra prediction to use. The standard does not define the decision method, but in general the generated code amounts of the 4×4 block and 16×16 block cases are compared and the prediction mode with the smaller code amount is selected.

FIG. 7 is a flowchart of the 4×4 block intra prediction performed by the intra prediction unit 101 in step 21. In step 31, the intra prediction unit 101 generates a predicted image for each prediction mode defined for 4×4 blocks. FIG. 8 illustrates how a predicted image is generated. In the figure, P is the predicted image, and X and Y are the adjacent blocks located above or to the left of the predicted image. The pixels a to p belonging to the predicted image P are determined by extrapolating the pixels A to M in the direction indicated by the prediction mode. Specific examples are shown in FIG. 9 and FIG. 10.

FIG. 9 shows the predicted image generated when mode 0, which indicates vertical prediction, is selected from the prediction modes shown in FIG. 4. In this case, the predicted image is generated by extrapolating the values of the adjacent pixels A to D in the vertical direction. FIG. 10 shows the predicted image generated when mode 1, which indicates horizontal prediction, is selected from the prediction modes shown in FIG. 4. In this case, the predicted image is generated by extrapolating the values of the adjacent pixels I to L in the horizontal direction.
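The two extrapolations described for FIG. 9 and FIG. 10 can be sketched as follows. This is a minimal illustration only: the function names and the pixel values are hypothetical, and the block is represented as a list of four rows.

```python
def predict_vertical(above):
    """Mode 0: copy the four pixels A-D above the block down every row."""
    return [list(above) for _ in range(4)]


def predict_horizontal(left):
    """Mode 1: copy the four pixels I-L left of the block across every column."""
    return [[value] * 4 for value in left]


above = [10, 80, 12, 11]  # hypothetical values for pixels A-D
left = [20, 21, 22, 23]   # hypothetical values for pixels I-L

vertical = predict_vertical(above)      # every row equals `above`
horizontal = predict_horizontal(left)   # row k is filled with left[k]
```

Note that an outlier in the row above the block (here the value 80) becomes a full vertical stripe in the mode 0 prediction.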

Next, in step 32, the intra prediction unit 101 calculates the cost required to encode the predicted image generated in step 31. The definition of the cost function J specified in the reference software of Non-Patent Document 4 is given as Equation 1.
Non-Patent Document 4: Karsten Suhring, "H.264/AVC Software Coordination", http://bs.hhi.de/suehring/tml

In Equation 1, D is the distortion indicating the similarity between the input image and the predicted image, R is the header code amount required to describe the prediction mode in the header, and QP is the quantization parameter indicating the compression ratio. The distortion can be calculated as the SAD (Sum of Absolute Differences) or the SSD (Sum of Squared Differences) between the input image and the predicted image. Equations 2 and 3 define SAD and SSD, where O is the input image, P is the predicted image, and i is an index indicating the position in the image.

SAD = \sum_i |O_i - P_i|   (Equation 2)

SSD = \sum_i (O_i - P_i)^2   (Equation 3)
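As a small illustration of the two distortion measures named above, the following sketch computes SAD and SSD over blocks given as flat lists of pixel values. The function names and sample values are hypothetical.

```python
def sad(original, predicted):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(o - p) for o, p in zip(original, predicted))


def ssd(original, predicted):
    """Sum of squared differences between two equally sized pixel lists."""
    return sum((o - p) ** 2 for o, p in zip(original, predicted))


original_row = [10, 12, 11, 13]   # hypothetical input pixels O_i
predicted_row = [10, 80, 12, 11]  # hypothetical predicted pixels P_i

row_sad = sad(original_row, predicted_row)
row_ssd = ssd(original_row, predicted_row)
```

SSD penalizes a single large error far more heavily than SAD, which is why the two measures can rank candidate predictions differently.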

The header code amount varies in length depending on its relation to the prediction mode numbers of the adjacent blocks. Specifically, it is 1 bit when the selected mode is the same as the smaller-numbered prediction mode of the adjacent blocks located above and to the left, and 4 bits when any other mode is selected.

FIG. 11 shows the header syntax defined in H.264/AVC. prev_Intra4x4_pred_mode_flag is 1-bit information that takes the value 1 when the selected mode is the same as the smaller-numbered mode of the adjacent blocks located above and to the left, and the value 0 when a different mode is selected. When the same mode is selected, 1 bit suffices because the decoding apparatus can derive the prediction mode of the prediction target block from the prediction modes of the adjacent blocks. When a different mode is selected, the encoding apparatus describes the remaining 8 modes, that is, all 9 modes minus the prediction mode of the adjacent block, using the 3 bits of rem_prev_Intra4x4_pred_mode_flag. This is information compression that exploits the high correlation of prediction modes between adjacent blocks. As a concrete example, when the mode numbers of the upper and left adjacent blocks are 4 and 8, respectively, selecting mode 4 for the prediction target block costs 1 bit, and selecting any other mode costs 4 bits.

In summary, Equation 1 takes into account both the similarity D between the input image and the predicted image and the header code amount R required to describe the prediction mode. The intra prediction unit 101 calculates the cost for all nine modes defined for 4×4 blocks and selects the prediction mode with the minimum cost.
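The 1-bit versus 4-bit rule can be sketched as below. The function names are hypothetical. The mapping of the eight remaining modes onto a 3-bit index (skipping over the predicted mode) is an assumption added for illustration; the text above only states that 3 bits are used.

```python
def header_bits(selected_mode, upper_mode, left_mode):
    """Bits needed to signal a 4x4 prediction mode, per the rule in the text."""
    predicted_mode = min(upper_mode, left_mode)  # smaller-numbered neighbor mode
    if selected_mode == predicted_mode:
        return 1       # the 1-bit flag alone
    return 1 + 3       # the flag plus a 3-bit remaining-mode index


def remaining_mode_index(selected_mode, predicted_mode):
    """Assumed mapping of the 8 non-predicted modes onto 0..7 (3 bits)."""
    if selected_mode == predicted_mode:
        raise ValueError("the predicted mode is signaled by the flag alone")
    return selected_mode if selected_mode < predicted_mode else selected_mode - 1


# Example from the text: upper neighbor uses mode 4, left neighbor uses mode 8.
bits_same = header_bits(4, upper_mode=4, left_mode=8)
bits_other = header_bits(8, upper_mode=4, left_mode=8)
```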

That is, in step 33, the intra prediction unit 101 checks whether the costs of all nine modes defined for 4×4 blocks have been calculated. Once the calculation is confirmed to be complete, in step 34 the intra prediction unit 101 selects the prediction mode with the minimum cost. This completes the 4×4 block intra prediction performed in step 21. For 16×16 block intra prediction, the decision is made using only the distortion D, without the header code amount R of Equation 1 (J = D). This is because four prediction modes are defined for 16×16 blocks, and any of them can be described in 2 bits. Apart from this point the processing is the same as for 4×4 blocks: the intra prediction unit 101 calculates the costs of all four modes defined for 16×16 blocks and selects the prediction mode with the minimum cost. This concludes the outline of H.264/AVC intra coding.
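Steps 31 to 34 amount to an exhaustive search over candidate modes. The sketch below assumes a cost of the form J = D + lam * R, where `lam` is a hypothetical weight that grows with QP; the exact form of Equation 1 is not reproduced in this text, so this is an assumption, as are the function name and sample data.

```python
def select_mode(original, candidates, upper_mode, left_mode, lam):
    """Return (mode, cost) minimizing the assumed cost J = D + lam * R.

    candidates maps a mode number to its predicted block (flat pixel list).
    """
    predicted_mode = min(upper_mode, left_mode)
    best_mode, best_cost = None, None
    for mode in sorted(candidates):                  # loop of steps 31 to 33
        prediction = candidates[mode]
        distortion = sum(abs(o - p) for o, p in zip(original, prediction))  # SAD
        bits = 1 if mode == predicted_mode else 4    # header code amount R
        cost = distortion + lam * bits
        if best_cost is None or cost < best_cost:    # step 34: keep the minimum
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


original = [10, 12, 11, 13] * 4          # hypothetical 4x4 input block
candidates = {
    0: [10, 80, 12, 11] * 4,             # vertical prediction with an outlier column
    2: [12] * 16,                        # flat (DC-like) prediction
}
chosen = select_mode(original, candidates, upper_mode=0, left_mode=2, lam=5)
```

With this small weight the flat prediction wins, because its distortion advantage outweighs its three extra header bits.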

However, Non-Patent Document 5 points out that at low bit rates, determining the prediction mode with Equation 1 causes horizontal and vertical linear noise that is absent from the input image (see FIG. 12) to appear in the decoded image (see FIG. 13).
Non-Patent Document 5: Tanida et al. (谷田他), "Verification of a linear noise suppression algorithm in H.264/MPEG-4 AVC", IEICE General Conference, D-11-38, 2004

There are two possible causes of linear noise. The first lies in the cost function of Equation 1. In Equation 1, at a low bit rate where the quantization parameter QP is large, the coefficient applied to the header code amount R becomes large, and the distortion D indicating the similarity between the input image and the predicted image carries little weight in the cost. That is, Equation 1 preferentially selects a prediction mode with a small header code amount even if the predicted image differs greatly from the input image. As described above, in 4×4 block intra prediction the header code amount is 1 bit when the smaller-numbered prediction mode of the adjacent blocks is selected and 4 bits when any other prediction mode is selected. Consequently, at a low bit rate with a large QP, the distortion D is given little weight as a cost, and the smaller-numbered prediction mode of the adjacent blocks is selected more frequently.

FIG. 14 shows an example in which a prediction mode with a small header code amount is selected. O is the input image, P is the predicted image, D is the difference image, X and Y are the adjacent blocks, and MX and MY are the prediction modes of adjacent blocks X and Y, respectively. In the figure, the predicted image P is generated by selecting the vertical prediction mode, the same as the prediction mode MY of adjacent block Y. However, as shown in the figure, the second column from the left (R1) of the predicted image P has a pixel value of 80, far from its surroundings. Using a predicted image whose pixel values differ markedly from their surroundings causes linear noise in the decoded image.

In other words, when the intra prediction unit 101 uses the cost function of Equation 1, which includes the header code amount, the cost is minimized by the smaller-numbered prediction mode of the adjacent blocks, because its header code amount is small. That prediction mode is therefore selected even though it may cause linear noise. This is considered one cause of linear noise.
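The bias described above can be illustrated numerically. The sketch assumes a cost of the form J = D + lam * R with a hypothetical weight `lam` that grows with QP; the distortion values are invented for illustration and are not taken from the figures.

```python
def cost(distortion, header_bits, lam):
    """Assumed cost form: distortion plus weighted header code amount."""
    return distortion + lam * header_bits


# (distortion, header bits) for two hypothetical candidate modes
modes = {
    "same_as_neighbor": (284, 1),  # cheap to signal, poor prediction
    "better_match": (16, 4),       # good prediction, 3 extra header bits
}


def winner(lam):
    """Name of the mode with the smaller cost at weight lam."""
    return min(modes, key=lambda name: cost(*modes[name], lam))


low_qp_choice = winner(5)     # small weight: distortion dominates
high_qp_choice = winner(100)  # large weight: header bits dominate
```

Once the weight is large enough that the 3-bit header saving exceeds the distortion gap, the poorly matching neighbor mode wins, which is the situation the text associates with linear noise.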

The second cause relates to quantization. The decoded image is modeled by Equation 5, where O is the input image, P is the predicted image, W is the orthogonal transform and quantization, and R is the decoded image. In Equation 5, the difference image corresponding to O - P loses texture information when the quantization step is large.

Therefore, at a low bit rate with a large quantization step, the difference image D in FIG. 14 loses the texture information in its second column from the left (R2), and the vertical line in the predicted image P is carried over unchanged into the decoded image, producing linear noise. In short, a predicted image generated with a prediction mode that Equation 1 selected despite the risk of linear noise is reflected directly in the decoded image, and linear noise results.
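The effect can be sketched with a deliberately simplified model. It assumes the decoded block is the prediction plus the dequantized residual, and it quantizes the residual directly in the pixel domain with a truncating uniform quantizer, skipping the orthogonal transform. These simplifications, the function names, and the sample values are assumptions for illustration, not the patent's Equation 5.

```python
def quantize(residual, step):
    """Uniform quantizer that truncates toward zero (assumed for simplicity)."""
    return [int(r / step) for r in residual]


def dequantize(levels, step):
    """Scale quantization levels back to residual values."""
    return [level * step for level in levels]


def decode_row(original, predicted, step):
    """Prediction plus the residual that survives quantization."""
    residual = [o - p for o, p in zip(original, predicted)]
    restored = dequantize(quantize(residual, step), step)
    return [p + r for p, r in zip(predicted, restored)]


original = [10, 12, 11, 13]   # hypothetical input row
predicted = [10, 80, 12, 11]  # vertical prediction with an outlier of 80

fine = decode_row(original, predicted, step=4)      # small quantization step
coarse = decode_row(original, predicted, step=128)  # large quantization step
```

With the small step the residual corrects the outlier, whereas with the large step the whole residual quantizes to zero and the decoded row equals the prediction, outlier included.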

In view of these problems, an object of the present invention is to provide an image encoding apparatus and an image encoding program that avoid selecting prediction modes liable to cause linear noise and thereby suppress linear noise in the decoded image.

The invention according to claim 1 is an image encoding apparatus that divides an input image into blocks and encodes the difference from a predicted image generated according to a prediction mode selected through intra prediction, the apparatus comprising: an adjacent prediction mode reference unit that refers to the prediction mode of a block adjacent to the encoding target block; a cost function switching unit that outputs a command to switch to a predetermined cost function according to the referenced adjacent prediction mode; and an intra prediction unit that switches to the predetermined cost function in accordance with the cost function switching command output from the cost function switching unit and generates a predicted image using a prediction mode selected on the basis of the result of calculation with that cost function. With this configuration, the cost function is switched according to the prediction mode of the adjacent block referenced during intra prediction, so that selection of a prediction mode liable to cause linear noise in the predicted image can be prevented and linear noise in the decoded image can be suppressed.

The invention according to claim 2 is the image encoding apparatus according to claim 1, wherein the cost function switching unit outputs a command to switch to a cost function consisting only of the distortion indicating the similarity between the input image and the predicted image when the referenced prediction mode is a mode that generates a predicted image by extrapolating the values of pixels adjacent to the encoding target block in a predetermined direction, and otherwise outputs a command to switch to a cost function that includes both the similarity and the header code amount required to describe the prediction mode in the header. With this configuration, a cost function that ignores the header code amount is used when the referenced prediction mode is liable to cause linear noise in the predicted image, and a cost function that takes the header code amount into account is used otherwise, so that linear noise is suppressed while high coding efficiency is achieved.

The invention according to claim 3 is the image encoding apparatus according to claim 2, wherein the predetermined direction is the horizontal or vertical direction. This suppresses linear noise in the horizontal and vertical directions, which statistically occurs most often.

The invention according to claim 4 is the image encoding apparatus according to claim 1, wherein the adjacent prediction mode reference unit refers to the prediction mode of a block located above or to the left of, and adjacent to, the encoding target block. This allows the invention to be implemented within the H.264/AVC standard.

The invention according to claim 5 is the image encoding apparatus according to claim 4, wherein the adjacent prediction mode reference unit refers to whichever prediction mode of the blocks located above and to the left has the smaller mode number. This allows the invention to be implemented within the H.264/AVC standard.

The invention according to claim 6 is an image encoding program that, when executed by a computer, causes the computer to function as: an adjacent prediction mode reference unit that refers to the prediction mode of a block adjacent to the encoding target block; a cost function switching unit that outputs a command to switch to a predetermined cost function according to the referenced adjacent prediction mode; and an intra prediction unit that switches to the predetermined cost function in accordance with the cost function switching command output from the cost function switching unit and generates a predicted image using a prediction mode selected on the basis of the result of calculation with that cost function. When the computer of an image encoding apparatus executes this program, the cost function is switched according to the prediction mode of the adjacent block referenced during intra prediction, so that selection of a prediction mode liable to cause linear noise in the predicted image can be prevented and linear noise in the decoded image can be suppressed.

As described above, according to the image encoding apparatus and the image encoding program of the present invention, the cost function is switched according to the prediction mode of the adjacent block referenced during intra prediction, so that selection of a prediction mode liable to cause linear noise in the predicted image can be prevented and linear noise in the decoded image can be suppressed.

The best mode for carrying out the present invention is described below. The present invention is not limited to the scope described in the following embodiment and may be modified and practiced as appropriate without departing from its gist.

An embodiment of the present invention is described below. FIG. 1 is a block diagram of an encoding apparatus that implements H.264/AVC intra coding and incorporates the image encoding program of the present invention. In FIG. 1, 101 is an intra prediction unit, 102 is an orthogonal transform unit, 103 is a quantization unit, 104 is a lossless encoding unit, 105 is a buffer memory, 106 is a rate control unit, 107 is an inverse quantization unit, 108 is an inverse orthogonal transform unit, 109 is a frame memory, 110 is a subtractor, 111 is an adder, 401 is an adjacent prediction mode reference unit, and 402 is a cost function switching unit. The principal blocks of the present invention are the adjacent prediction mode reference unit 401, the cost function switching unit 402, and the intra prediction unit 101. Components identical to those of the conventional encoding apparatus shown in FIG. 3 are given the same reference numerals.

First, the processing functions of an image encoding apparatus that implements H.264/AVC intra coding are described in order. As described above, the intra prediction unit 101 receives the decoded image stored in the frame memory 109 and performs intra prediction. Exploiting the high correlation between adjacent pixels, it uses the input decoded image to select one mode from the plurality of prediction modes defined in H.264/AVC and generates an image called a predicted image. The generated predicted image is output, together with the prediction mode, to the subtractor 110 and the adder 111.

The orthogonal transform unit 102 receives the difference image between the input image and the predicted image output from the subtractor 110 and applies an orthogonal transform, such as the discrete cosine transform or the Karhunen-Loève transform, to the difference image. The transformed difference image is output to the quantization unit 103 as transform coefficients. The quantization unit 103 receives the transform coefficients output from the orthogonal transform unit 102 and the quantization parameter output from the rate control unit 106, and quantizes the transform coefficients. The quantized transform coefficients are output to the lossless encoding unit 104 and the inverse quantization unit 107 as encoded transform coefficients.

The lossless encoding unit 104 receives the encoded transform coefficients output from the quantization unit 103 and applies lossless coding based on variable-length coding or arithmetic coding to generate codewords. The generated codewords are output to the buffer memory 105, which stores them. The stored codewords are referenced by the rate control unit 106 and then output as the encoded image. The rate control unit 106 determines the compression ratio with reference to the buffer occupancy of the buffer memory 105. The determined compression ratio is converted into a quantization parameter and output to the quantization unit 103.

The inverse quantization unit 107 receives the encoded transform coefficients output from the quantization unit 103 and inverse-quantizes them. The inverse-quantized coefficients are output to the inverse orthogonal transform unit 108 as decoded transform coefficients. The inverse orthogonal transform unit 108 receives the decoded transform coefficients output from the inverse quantization unit 107 and applies the inverse orthogonal transform to them. The result is output to the adder 111 as a decoded difference image. The frame memory 109 receives the sum of the predicted image output from the intra prediction unit 101 and the decoded difference image output from the inverse orthogonal transform unit 108, and stores this sum as the decoded image. The stored decoded image is referenced when the intra prediction unit 101 performs intra prediction.

The adjacent prediction mode reference unit 401 receives the prediction modes of the decoded image held in the intra prediction unit 101 and refers to the prediction mode already selected for a block adjacent to the encoding target block. The encoding target block is the block for which a prediction mode is to be determined. Of the mode numbers of the adjacent blocks located above and to the left of the encoding target block, the smaller-numbered prediction mode is referenced. The referenced adjacent prediction mode is output to the cost function switching unit 402.

The cost function switching unit 402 receives the adjacent prediction mode output from the adjacent prediction mode reference unit 401 and switches to a predetermined cost function according to the referenced adjacent prediction mode. Specifically, when the adjacent prediction mode is the horizontal or vertical prediction mode, it switches to the cost function of Equation 4, which consists only of the distortion amount D. Otherwise, it switches to the cost function of Equation 1, which also takes the header code amount into account. The switch is output to the intra prediction unit 101 as a cost function switching command.

<When the adjacent prediction mode is the horizontal or vertical prediction mode>
Equation 4 (cost function defined only by the distortion amount D)

<When the adjacent prediction mode is neither the horizontal nor the vertical prediction mode>
Equation 1 (cost function that includes the header code amount)
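The switching rule can be sketched as follows. The equations themselves are not reproduced in this text, so the form used for Equation 1 (distortion plus a weight times the header code amount) is an assumption, as are the function names and the weight `lam`. The mode numbers 0 (vertical) and 1 (horizontal) follow the H.264 numbering for 4×4 intra prediction.

```python
from typing import Callable

VERTICAL, HORIZONTAL = 0, 1  # H.264 4x4 intra prediction mode numbers


def cost_distortion_only(distortion: float, header_bits: float, lam: float) -> float:
    """Stand-in for Equation 4: the cost is the distortion amount D alone."""
    return distortion


def cost_with_header(distortion: float, header_bits: float, lam: float) -> float:
    """Stand-in for Equation 1 (assumed form): D plus weighted header code amount."""
    return distortion + lam * header_bits


def switch_cost_function(adjacent_mode: int) -> Callable[[float, float, float], float]:
    """Choose the cost function from the referenced adjacent prediction mode."""
    if adjacent_mode in (VERTICAL, HORIZONTAL):
        return cost_distortion_only
    return cost_with_header
```

Returning the function itself plays the role of the cost function switching command: the caller applies whichever function it receives to every candidate mode.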

FIG. 2 is a flowchart showing the cost function calculation process. The following processing flow is applied to each 4×4 block. First, in step 51, the adjacent prediction mode reference unit 401 refers to the prediction mode of the adjacent block located above or to the left of the encoding target block. Specifically, of the mode numbers of the upper and left adjacent blocks, it refers to the prediction mode with the smaller number. The referenced adjacent prediction mode is output to the cost function switching unit 402.

Next, in step 52, the cost function switching unit 402 receives the adjacent prediction mode output from the adjacent prediction mode reference unit 401 and determines whether the referenced adjacent prediction mode is a mode that generates a predicted image by extrapolating, in a predetermined direction, the pixel values of pixels adjacent to the encoding target block. In this embodiment, it determines whether the mode is the horizontal or vertical prediction mode, which are statistically likely to occur.

When the adjacent prediction mode is the horizontal or vertical prediction mode, in step 53 the cost function switching unit 402 switches to the cost function of Equation 4 above, which is defined only by the distortion amount D. The reason is that, at low bit rates, when the adjacent prediction mode is the horizontal or vertical prediction mode, the intra prediction unit 101 preferentially selects that same horizontal or vertical prediction mode because its header code amount is small, and linear noise results. For example, in the case shown in FIG. 14 in the description of the prior art, where the same vertical prediction mode as the prediction mode MY of the adjacent block Y would be selected, the adjacent prediction mode is determined to be the vertical prediction mode and Equation 4 is used. Because the cost function is then defined only by the distortion amount D, as in Equation 4, the similarity between the input image O and the predicted image P is judged to be low, and the vertical prediction mode that would conventionally have been selected is not used to generate the predicted image. Conventionally, linear noise occurred in predicted images generated with the vertical prediction mode. In this embodiment, when a prediction mode that may cause linear noise in the predicted image would be selected, a cost function that does not consider the header code amount is used instead, which prevents that prediction mode from being selected and suppresses linear noise in the decoded image.
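The effect described above can be checked with a small arithmetic example. All distortion values and the weight `lam` are hypothetical. The header code amounts of 1 bit and 4 bits reflect the H.264 syntax, in which a 4×4 mode equal to the one predicted from the neighbors is signaled with a single flag, while any other mode needs the flag plus a 3-bit remainder. The additive form of the header-aware cost is an assumption.

```python
# (distortion D, header bits R) per candidate mode; the D values are hypothetical.
candidates = {
    "vertical": (120, 1),  # same mode as the adjacent block: cheap to signal
    "diagonal": (100, 4),  # closer to the input image, but costlier to signal
}
lam = 10  # hypothetical weight on the header code amount


def best_mode(cost):
    """Return the candidate whose cost(D, R) is smallest."""
    return min(candidates, key=lambda mode: cost(*candidates[mode]))


with_header = best_mode(lambda d, r: d + lam * r)  # 130 vs 140
distortion_only = best_mode(lambda d, r: d)        # 120 vs 100
```

With the header term, the vertical mode wins (130 against 140) even though its prediction is less similar to the input. With distortion alone, the diagonal mode wins (100 against 120), so the vertical mode that tends to propagate linear noise is not chosen.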

On the other hand, when the adjacent prediction mode is neither the horizontal nor the vertical prediction mode, in step 54 the cost function switching unit 402 switches to the cost function of Equation 1 above, which takes the header code amount into account. Thus, when the prediction mode that would be selected poses no risk of linear noise in the predicted image, a cost function that considers the header code amount is used, so that linear noise is suppressed while high coding efficiency is maintained.

Finally, in step 54, the intra prediction unit 101 calculates the cost using the cost function in effect after switching. The intra prediction unit 101 computes the cost for all nine modes defined for 4×4 blocks, selects the prediction mode with the minimum cost, and generates the predicted image using that mode.
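The minimum-cost selection in this final step might look like the sketch below. It assumes the sum of absolute differences (SAD) as the distortion measure, which the text does not specify, and takes the candidate predicted blocks as input instead of generating all nine H.264 predictions. The names `sad`, `select_mode`, and `cost_fn` are hypothetical, and breaking ties toward the lower mode number is also an assumption.

```python
def sad(block, predicted):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(o - p)
               for row_o, row_p in zip(block, predicted)
               for o, p in zip(row_o, row_p))


def select_mode(block, predictions, cost_fn):
    """Return (mode, cost) with the minimum cost over the candidate modes.

    predictions maps a mode number to its predicted block;
    cost_fn(distortion, mode) is the cost function currently switched in.
    """
    best_mode, best_cost = None, None
    for mode in sorted(predictions):
        cost = cost_fn(sad(block, predictions[mode]), mode)
        if best_cost is None or cost < best_cost:  # ties keep the lower mode number
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical 4x4 input block and two candidate predictions (modes 0 and 2).
block = [[10] * 4 for _ in range(4)]
predictions = {0: [[12] * 4 for _ in range(4)],
               2: [[11] * 4 for _ in range(4)]}
mode, cost = select_mode(block, predictions, lambda d, m: d)  # distortion only
```

Passing the cost function as a parameter keeps the selection loop unchanged regardless of which cost function the switching unit has chosen.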

As described above, according to this embodiment, the cost function is switched to a predetermined one based on the prediction mode of the adjacent block referred to during intra prediction. This prevents the selection of a prediction mode that may cause linear noise in the predicted image and suppresses linear noise in the decoded image.

The image encoding apparatus of the present invention can produce compressed still images of high quality and therefore offers excellent random accessibility and editability. It is thus useful in image storage devices and image editing devices that have an image search function.

FIG. 1: Block diagram showing the overall configuration of the image encoding apparatus of the present invention
FIG. 2: Flowchart showing the cost function calculation process of the present invention
FIG. 3: Block diagram showing the overall configuration of a conventional image encoding apparatus
FIG. 4: Diagram showing the prediction mode numbers and directions for 4×4 blocks
FIG. 5: Diagram showing the prediction mode numbers and directions for 16×16 blocks
FIG. 6: Flowchart showing the processing performed by the intra prediction unit
FIG. 7: Flowchart showing the intra prediction processing for 4×4 blocks
FIG. 8: Diagram showing the encoding target block and its adjacent blocks
FIG. 9: Diagram showing the predicted image generated when the vertical prediction mode is selected
FIG. 10: Diagram showing the predicted image generated when the horizontal prediction mode is selected
FIG. 11: Diagram showing the header syntax defined in H.264/AVC
FIG. 12: Diagram showing an input image (no linear noise)
FIG. 13: Diagram showing a decoded image (linear noise occurs)
FIG. 14: Diagram showing an example in which a prediction mode with a small header code amount is selected

Explanation of symbols

101 Intra prediction unit
401 Adjacent prediction mode reference unit
402 Cost function switching unit

Claims

1. An image encoding apparatus that divides an input image into blocks and encodes a difference from a prediction image generated according to a prediction mode selected by performing intra prediction, the image encoding apparatus comprising:
an adjacent prediction mode reference unit that refers to a prediction mode of a block adjacent to the encoding target block;
a cost function switching unit that outputs a command to switch to a predetermined cost function according to the referenced adjacent prediction mode; and
an intra prediction unit that switches to the predetermined cost function according to the cost function switching command output from the cost function switching unit and generates a prediction image using a prediction mode selected based on a result of calculation processing using the cost function.

2. The image encoding apparatus according to claim 1, wherein the cost function switching unit outputs a command to switch to a cost function consisting only of a distortion amount indicating the similarity between the input image and the predicted image when the referenced prediction mode is a mode that generates a predicted image by extrapolating, in a predetermined direction, pixel values of pixels adjacent to the encoding target block, and otherwise outputs a command to switch to a cost function that includes the similarity and the header code amount required to describe the prediction mode in a header.

3. The image encoding apparatus according to claim 2, wherein the predetermined direction is a horizontal or vertical direction.

4. The image encoding apparatus according to claim 1, wherein the adjacent prediction mode reference unit refers to a prediction mode of a block that is adjacent to the encoding target block and located above or to the left of it.

5. The image encoding apparatus according to claim 4, wherein the adjacent prediction mode reference unit refers to the prediction mode with the smaller mode number among the blocks located above and to the left.

6. An image encoding program that causes a computer to function as:
an adjacent prediction mode reference unit that refers to a prediction mode of a block adjacent to the encoding target block;
a cost function switching unit that outputs a command to switch to a predetermined cost function according to the referenced adjacent prediction mode; and
an intra prediction unit that switches to the predetermined cost function according to the cost function switching command output from the cost function switching unit and generates a predicted image using a prediction mode selected based on a result of calculation processing using the cost function.