
HK1208294B - Image decoding device, image encoding device and image decoding method - Google Patents

Image decoding device, image encoding device and image decoding method

Info

Publication number
HK1208294B
HK1208294B (Application HK15108848.0A)
Authority
HK
Hong Kong
Prior art keywords
vector
unit
image
prediction
layer
Prior art date
Application number
HK15108848.0A
Other languages
Chinese (zh)
Other versions
HK1208294A1 (en)
Inventor
Tomohiro Ikai
Tadashi Uchiumi
Yoshiya Yamamoto
Original Assignee
Sharp Corporation (夏普株式会社)
Priority date
Filing date
Publication date
Application filed by Sharp Corporation (夏普株式会社)
Priority claimed from PCT/JP2013/076016 (WO2014050948A1)
Publication of HK1208294A1
Publication of HK1208294B

Description

Image decoding device, image encoding device, and image decoding method
Technical Field
The present invention relates to an image decoding device and an image encoding device.
The present application claims priority based on Japanese Patent Application No. 2012-217904 filed on September 28, 2012, the contents of which are incorporated herein by reference.
Background
In techniques for encoding images from a plurality of viewpoints, disparity-predictive encoding, which reduces the amount of information by predicting the disparity between images when encoding them, and decoding methods corresponding to such encoding have been proposed (for example, Non-Patent Document 1). A vector representing the disparity between viewpoint images is referred to as a disparity vector. The disparity vector is a two-dimensional vector having a horizontal element value (x component) and a vertical element value (y component), and is calculated for each block, a block being a region into which one image is divided. To acquire images from a plurality of viewpoints, cameras arranged at the respective viewpoints are generally used. In multi-view coding, each viewpoint image is coded as a different layer among a plurality of layers. A coding method in which a moving image is composed of a plurality of layers in this way is generally called scalable coding or hierarchical coding. In scalable coding, high coding efficiency is achieved by performing prediction between layers. The layer that serves as a reference without inter-layer prediction is referred to as the base layer, and the other layers are referred to as extension layers. Scalable coding in which the layers are composed of viewpoint images is referred to as view scalable coding. In this case, the base layer is also referred to as the base view, and an extension layer is also referred to as a non-base view.
Besides view scalable coding, scalable coding includes spatial scalable coding (processing a picture having a low resolution as the base layer and a picture having a high resolution as an extension layer), SNR scalable coding (processing a picture having a low quality as the base layer and a picture having a high quality as an extension layer), and the like. In scalable coding, for example, a picture of the base layer is used as a reference picture in coding a picture of an extension layer.
Documents of the prior art
Non-patent document
Non-patent document 1: "High efficiency video coding draft 8", JCTVC-J1003, Stockholm, SE, July 2012
Disclosure of Invention
Problems to be solved by the invention
However, when the cameras are arranged one-dimensionally, parallax in the direction orthogonal to the arrangement may not occur. For example, when two cameras are arranged in the horizontal direction, parallax arises mainly in the horizontal direction (the X component of the displacement vector). In such a case, the code generated by encoding the parallax in the direction orthogonal to the camera arrangement (the Y component of the displacement vector) is redundant. Similarly, in spatial scalable coding and SNR scalable coding, no disparity occurs between the base layer and an extension layer, so the code generated by encoding such a disparity (displacement vector) is redundant.
The present invention has been made in view of the above-described points, and provides an image decoding device and an image encoding device that can improve encoding efficiency.
Means for solving the problems
(1) The present invention has been made to solve the above-described problems, and an image decoding device according to an aspect of the present invention may include: a prediction parameter derivation unit that derives a prediction parameter of a target prediction block; and a predicted image generation unit configured to read a reference image of the region indicated by the vector derived by the prediction parameter derivation unit and to generate a predicted image based on the read reference image, wherein the prediction parameter derivation unit includes an outer-layer reference address conversion unit configured to convert the coordinates used when referring to the prediction parameters of a reference block, in the case where the target image to which the target prediction block belongs and the reference image to which the reference block, which is a part of the reference image, belongs are in different layers.
(2) The operation of the outer-layer reference address conversion unit in the image decoding device according to an aspect of the present invention to convert the coordinates may include an operation of discretizing the coordinates into coarser units.
(3) In addition, the operation of the outer-layer reference address conversion unit in the image decoding device according to an aspect of the present invention to convert the coordinates may be an operation of shifting each of the X coordinate and the Y coordinate 3 bits to the right and then 3 bits to the left.
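For illustration, the coordinate conversion of aspect (3) can be sketched as follows. The function name and the parameterized shift amount are assumptions of this sketch, not part of the described device; with a shift of 3 bits the coordinates snap to an 8-sample grid, so the prediction parameters of an out-of-layer reference need only be stored at that granularity.

```python
def discretize_coords(x: int, y: int, shift: int = 3) -> tuple:
    """Discretize (x, y) to a coarse grid by dropping the low `shift` bits:
    a right shift followed by a left shift of the same amount, as in
    aspect (3). With shift = 3 this yields an 8-sample grid."""
    return (x >> shift) << shift, (y >> shift) << shift
```

For example, the coordinates (13, 22) map to (8, 16), so all positions within the same 8x8 region reference the same stored prediction parameter.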
(4) In addition, the present invention has been made to solve the above-described problems, and an image encoding device according to an aspect of the present invention may include: a prediction parameter derivation unit that derives a prediction parameter or a prediction vector of a target prediction block; and a predicted image generation unit configured to read a reference image of the region indicated by the vector derived by the prediction parameter derivation unit and to generate a predicted image based on the read reference image, wherein the prediction parameter derivation unit includes: a prediction parameter reference unit that refers to the stored prediction parameters; and an outer-layer reference address conversion unit that converts the coordinates used when referring to the prediction parameters of a reference block, in the case where the target picture to which the target prediction block belongs and the reference picture to which the reference block, which is a part of the reference picture, belongs are in different layers.
(5) Furthermore, an image decoding device according to another aspect of the present invention may include: a vector difference decoding unit that derives a context for arithmetic coding and decodes a vector difference from encoded data; a vector derivation unit that derives the vector of a target block as the sum of the vector of a processed block and the vector difference; and a predicted image generation unit configured to read a reference image of the region indicated by the vector of the target block derived by the vector derivation unit and to generate a predicted image based on the read reference image, wherein the vector difference decoding unit assigns a context based on whether or not the vector of the target block or the vector difference pertains to prediction between different layers.
(6) In the vector difference decoding unit of the image decoding device according to the other aspect of the present invention, when the vector of the target block or the vector difference pertains to prediction between different layers, different contexts may be assigned to the syntax element constituting the vertical component of the vector difference and the syntax element constituting the horizontal component of the vector difference.
(7) In the vector difference decoding unit of the image decoding device according to the other aspect of the present invention, in determining whether the vector or the vector difference of the target block pertains to prediction between different layers, different contexts may be assigned to the syntax elements constituting at least one component of the vector difference, depending on whether or not it is determined that prediction between different layers is performed.
(8) The syntax element of the image decoding device according to one embodiment of the present invention may be information indicating whether or not the absolute value of the vector difference exceeds 0.
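A minimal sketch of the context assignment described in aspects (5) to (8). The function name and the concrete context indices are purely illustrative assumptions; the point is only that an inter-layer (displacement) vector difference receives contexts separate from an ordinary motion vector difference, and that its vertical component, whose value is usually near zero, is distinguished from its horizontal component.

```python
def select_mvd_context(is_inter_layer: bool, is_vertical: bool) -> int:
    """Choose an arithmetic-coding context index for the syntax element
    indicating whether the absolute value of a vector-difference component
    exceeds 0 (aspect (8)). Context numbering is illustrative only."""
    if not is_inter_layer:
        return 0                     # shared context for ordinary motion vector differences
    return 2 if is_vertical else 1   # separate contexts for inter-layer (displacement) differences
```

Because the contexts are separate, the probability model for the near-always-zero vertical component of a disparity adapts independently, which is the source of the coding-efficiency gain.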
(9) Furthermore, an image encoding device according to another aspect of the present invention may include: a vector difference encoding unit that derives a context for arithmetic coding and encodes a vector difference; a vector difference derivation unit that derives the vector difference from the vector of a processed block and the vector of a target block; and a predicted image generation unit configured to read a reference image of the region indicated by the vector of the target block generated by the vector difference derivation unit and to generate a predicted image based on the read reference image, wherein the vector difference derivation unit assigns a context based on whether or not the vector of the target block or the vector difference pertains to prediction between different layers.
(10) Furthermore, an image decoding device according to another aspect of the present invention may include: a displacement vector generation unit that generates a displacement vector indicating a displacement between a first layer image and a second layer image different from the first layer image, based on a symbol indicating the displacement; a displacement vector limiting unit that limits the displacement vector to a value within a predetermined range; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit and generates a predicted image based on the read reference image.
(11) In addition, in the range in which the displacement vector restricting unit of the image decoding device according to another aspect of the present invention restricts the value of the displacement vector, the range of the vertical component may be smaller than the range of the horizontal component.
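The restriction of aspects (10) and (11) can be sketched as follows. The function name and the concrete range bounds are assumptions of this example; a real device would derive the ranges from the memory budget for reference picture access, with the vertical range narrower than the horizontal one as stated in aspect (11).

```python
def clip_displacement_vector(dv_x: int, dv_y: int,
                             x_range: tuple = (-128, 127),
                             y_range: tuple = (-8, 7)) -> tuple:
    """Restrict a displacement vector to a predetermined range (aspect (10)),
    with the vertical range smaller than the horizontal range (aspect (11)).
    The bounds used here are illustrative."""
    def clip(v: int, lo: int, hi: int) -> int:
        return max(lo, min(hi, v))
    return clip(dv_x, *x_range), clip(dv_y, *y_range)
```

Limiting the vertical component to a small range means the predicted image generation unit only ever reads reference samples from a narrow horizontal band, which reduces the required reference memory.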
(12) The image decoding device may further include a prediction parameter deriving unit that derives, as at least a part of the prediction parameters, a prediction vector that is a prediction value of the motion vector or the displacement vector, with reference to the prediction parameters for the area indicated by the displacement vector limited by the displacement vector limiting unit in the image decoding device according to another aspect of the present invention.
(13) Furthermore, an image encoding device according to another aspect of the present invention may include: a displacement vector generation unit that generates, based on a part of the displacement between a first layer image and a second layer image different from the first layer image, a displacement vector indicating that part of the displacement and another part of the displacement; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit and generates a predicted image based on the read reference image.
(14) Furthermore, an image encoding device according to another aspect of the present invention may include: a displacement vector generation unit that generates a displacement vector indicating a displacement between a first layer image and a second layer image different from the first layer image, based on a symbol indicating the displacement; a displacement vector limiting unit that limits the displacement vector to a value within a predetermined range; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit and generates a predicted image based on the read reference image.
(15) Furthermore, an image decoding device according to another aspect of the present invention may include: a displacement vector generation unit that generates, based on a symbol indicating a part of the displacement between a first layer image and a second layer image different from the first layer image, a displacement vector indicating that part of the displacement and another part of the displacement; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit and generates a predicted image based on the read reference image.
(16) In the image decoding device according to another aspect of the present invention, the displacement vector generation unit may determine the vertical component, a predicted value of the vertical component, or a prediction residual of the vertical component to be a predetermined value.
(17) In the image decoding device according to another aspect of the present invention, the displacement vector generation unit may calculate the vertical component of the displacement based on a symbol indicating a relationship between the vertical component and the horizontal component.
Effects of the invention
According to the present invention, the memory used for generating a predicted image is reduced, and the encoding efficiency is improved.
Drawings
Fig. 1 is a schematic diagram showing a configuration of an image transmission system according to an embodiment of the present invention.
Fig. 2 is a schematic diagram showing the configuration of the image decoding device according to the present embodiment.
Fig. 3 is a schematic diagram showing the configuration of an external prediction parameter decoding unit according to the present embodiment.
Fig. 4 is a schematic diagram showing the configuration of an external prediction parameter extraction unit according to the present embodiment.
Fig. 5 is a schematic diagram showing the configuration of the AMVP prediction parameter derivation unit according to modification D1 of the present embodiment.
Fig. 6 is a schematic diagram showing the configuration of an external prediction parameter decoding unit according to modification D3 of the present embodiment.
Fig. 7 is a schematic diagram showing the configuration of the displacement vector generator according to modification D3 of the present embodiment.
Fig. 8 is a schematic diagram showing the configuration of an external prediction parameter decoding unit according to modification D4 of the present embodiment.
Fig. 9 is a schematic diagram showing the configuration of a displacement vector limiter (clip) unit according to modification D4 of the present embodiment.
Fig. 10 is a conceptual diagram illustrating an example of a configuration in which the external predicted image generation unit reads the reference image block in modification D4 of the present embodiment.
Fig. 11 is a schematic diagram showing the configuration of the extended vector candidate derivation unit according to modification D5 of the present embodiment.
Fig. 12 is a conceptual diagram illustrating an example of a configuration for reading the prediction parameters in modification D5 of the present embodiment.
Fig. 13 is a schematic diagram showing the configuration of the merged prediction parameter deriving unit according to modification D5 of the present embodiment.
Fig. 14 is a block diagram showing the configuration of the image coding apparatus according to the present embodiment.
Fig. 15 is a schematic diagram showing the configuration of an external prediction parameter encoding unit according to the present embodiment.
Fig. 16 is a schematic diagram showing the configuration of an external prediction parameter encoding unit according to modification E3 of the present embodiment.
Fig. 17 is a conceptual diagram illustrating an example of the reference picture list.
Fig. 18 is a conceptual diagram illustrating an example of vector candidates.
Fig. 19 is a conceptual diagram showing an example of a reference image.
Fig. 20 is a conceptual diagram illustrating an example of an adjacent block.
Fig. 21 is a conceptual diagram illustrating an example of the structure of encoded data.
Fig. 22 is a conceptual diagram illustrating an example of the structure of a coded stream.
Fig. 23 is a diagram illustrating the necessity of limiting a displacement vector.
Fig. 24 is a block diagram showing the configuration of an entropy decoding unit according to this embodiment.
Fig. 25 is a block diagram showing the configuration of the entropy decoding unit 301f according to modification D6 of the present embodiment.
Fig. 26 is a diagram showing an example of the derived table information.
Fig. 27 is a diagram showing another example of deriving table information.
Fig. 28 is a diagram showing another example of deriving table information.
Fig. 29 is a diagram showing another example of deriving table information.
Fig. 30 is a diagram showing another example of deriving table information.
Fig. 31 is a diagram showing another example of deriving table information.
Fig. 32 is a schematic diagram showing the configuration of an entropy decoding unit according to modification D7 of the present embodiment.
Fig. 33 is a diagram showing an example of the derived table information according to modification D7 of the present embodiment.
Fig. 34 is a block diagram showing the configuration of an entropy encoding unit according to modification E6 of the present embodiment.
Fig. 35 is a block diagram showing the configuration of an entropy encoding unit according to modification E7 of the present embodiment.
Fig. 36 is a block diagram showing the configuration of the AMVP prediction parameter derivation unit and the prediction parameter memory according to modification D8 of the present embodiment.
Fig. 37 is a diagram showing an example of equations used for coordinate transformation.
Fig. 38 is a block diagram showing the configuration of the merged prediction parameter deriving unit and the prediction parameter memory 307 according to modification D8 of the present embodiment.
Fig. 39 is a diagram showing the configuration of the AMVP prediction parameter derivation unit and the prediction parameter memory according to modification D8 of the present embodiment.
Fig. 40 is a diagram showing an example of equations used for coordinate transformation.
Fig. 41 is a conceptual diagram illustrating another example of the region to be referred to by the coordinate transformation.
Fig. 42 is a conceptual diagram illustrating another example of the region to be referred to by the coordinate transformation.
Fig. 43 is a block diagram showing the configuration of the AMVP prediction parameter derivation unit according to modification E8 of the present embodiment.
Fig. 44 is a block diagram showing a configuration of a merged prediction parameter deriving unit according to modification E8 of the present embodiment.
Detailed Description
(first embodiment)
Embodiments of the present invention will be described below with reference to the drawings.
Fig. 1 is a schematic diagram showing the configuration of an image transmission system 1 according to the present embodiment.
The image transmission system 1 is a system that transmits a symbol in which a plurality of layer images are encoded, and displays an image in which the transmitted symbol is decoded. The image transmission system 1 includes an image encoding device 11, a network 21, an image decoding device 31, and an image display device 41.
The image encoding device 11 receives a signal T representing a plurality of layer images (also referred to as texture images). A layer image is an image viewed or captured at a certain resolution and from a certain viewpoint. In view scalable coding, in which a three-dimensional image is coded using a plurality of layer images, each of the plurality of layer images is referred to as a viewpoint image. Here, a viewpoint corresponds to the position or observation point of an imaging device. For example, a plurality of viewpoint images are images captured by left and right imaging devices facing the subject. The image encoding device 11 encodes each of these signals to generate an encoded stream Te. Details of the encoded stream Te will be described later. A viewpoint image is a two-dimensional image (planar image) observed from a certain viewpoint.
A viewpoint image is represented, for example, by the luminance values or color signal values of pixels arranged in a two-dimensional plane. Hereinafter, one viewpoint image, or a signal representing it, is referred to as a picture. When spatial scalable coding is performed using a plurality of layer images, the plurality of layer images consist of a base layer image having a low resolution and extension layer images having a high resolution. When SNR scalable coding is performed using a plurality of layer images, the plurality of layer images consist of a base layer image having a low quality and extension layer images having a high quality. View scalable coding, spatial scalable coding, and SNR scalable coding may also be combined arbitrarily.
The network 21 transmits the encoded stream Te generated by the image encoding device 11 to the image decoding device 31. The network 21 is the Internet, a wide area network (WAN), a local area network (LAN), or a combination of these. The network 21 is not necessarily limited to a bidirectional communication network, and may be a unidirectional or bidirectional communication network for transmission and broadcasting, such as terrestrial digital broadcasting or satellite broadcasting. The network 21 may also be replaced by a storage medium on which the encoded stream Te is recorded, such as a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc).
The image decoding device 31 decodes each of the encoded streams Te transmitted over the network 21, and generates a plurality of decoded layer images Td (decoded viewpoint images Td).
The image display device 41 displays all or some of the plurality of decoded layer images Td generated by the image decoding device 31. For example, in view scalable coding, a three-dimensional image (stereoscopic image) or a free-viewpoint image is displayed when all of them are displayed, and a two-dimensional image is displayed when only some of them are displayed. The image display device 41 includes a display device such as a liquid crystal display or an organic EL (electroluminescence) display. In spatial scalable coding and SNR scalable coding, when the image decoding device 31 and the image display device 41 have high processing capability, an extension layer image with high image quality is displayed; when they have only lower processing capability, a base layer image, which requires neither the processing capability nor the display capability of the extension layer, is displayed.
(construction of image decoding apparatus)
Next, the configuration of the image decoding device 31 according to the present embodiment will be described.
Fig. 2 is a schematic diagram showing the configuration of the image decoding device 31 according to the present embodiment.
The image decoding device 31 includes an entropy decoding unit 301, a prediction parameter decoding unit 302, a reference image memory (reference image storage unit) 306, a prediction parameter memory (prediction parameter storage unit) 307, a predicted image generating unit (predicted image generating unit) 308, an inverse quantization/inverse DCT unit 311, and an adding unit 312.
The prediction parameter decoding unit 302 includes an external prediction parameter decoding unit (displacement vector generating unit) 303 and an internal prediction parameter decoding unit 304. The predicted image generating unit 308 includes an external predicted image generating unit 309 and an internal predicted image generating unit 310.
The entropy decoding unit 301 entropy-decodes the encoded stream Te input from the outside to generate a symbol string, and separates individual symbols from the generated symbol string. The separated symbols include prediction information for generating a predicted image, residual information for generating a difference image, and the like. The prediction information includes, for example, a merge flag (merge_flag), an external prediction flag, a reference picture index refIdx (refIdxLX), a prediction vector index mvp_LX_idx, a difference vector mvd (mvdLX), a prediction mode predMode, and a merge index merge_idx; the residual information includes a quantization parameter and quantized coefficients.
These symbols are obtained for each image block. An image block is one of a plurality of parts into which one image is divided, or a signal indicating such a part. One block contains a plurality of pixels, for example 8 pixels in the horizontal direction and 8 pixels in the vertical direction (64 pixels in total). In the following description, an image block may simply be called a block. A block to be decoded is referred to as a decoding target block, and a block belonging to a reference picture is referred to as a reference block. In addition, a block to be encoded is referred to as an encoding target block.
As will be described later, the image block serving as the unit for generating a predicted image and the image block serving as the unit for deriving quantized coefficients may have different sizes. An image block serving as the unit for generating a predicted image is called a prediction unit (PU). An image block serving as the unit for deriving quantized coefficients is called a transform unit (TU).
The external prediction flag is data indicating the type and number of reference pictures used in the external prediction described later, and takes one of the values Pred_L0, Pred_L1, and Pred_Bi. Pred_L0 and Pred_L1 each indicate that one reference picture is used (uni-prediction): the use of a reference picture stored in the reference picture list called the L0 list or the L1 list, respectively. Pred_Bi indicates that two reference pictures are used (bi-prediction), that is, both a reference picture stored in the L0 list and a reference picture stored in the L1 list. In uni-prediction, prediction that refers to a reference picture stored in the L0 list, which is a picture preceding the decoding (or encoding) target picture, is referred to as L0 prediction; prediction that refers to a reference picture stored in the L1 list, which is a picture following the target picture, is referred to as L1 prediction. The prediction vector index mvp_LX_idx is an index indicating a prediction vector, and the reference picture index refIdx is an index indicating a reference picture stored in a reference picture list. Further, the reference picture index used in L0 prediction is denoted refIdxL0, and that used in L1 prediction refIdxL1; refIdx (refIdxLX) is the notation used when refIdxL0 and refIdxL1 are not distinguished.
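The mapping from the external prediction flag to the reference picture lists it uses can be illustrated with a small sketch; the function name and tuple representation are assumptions of the example, not part of the codec:

```python
def reference_lists(external_pred_flag: str) -> tuple:
    """Return the reference picture lists implied by the external prediction
    flag: Pred_L0 and Pred_L1 are uni-prediction from the L0 or L1 list,
    and Pred_Bi (bi-prediction) uses both lists."""
    table = {
        "Pred_L0": ("L0",),
        "Pred_L1": ("L1",),
        "Pred_Bi": ("L0", "L1"),
    }
    return table[external_pred_flag]
```

So a block coded with Pred_Bi carries two reference picture indices (refIdxL0 and refIdxL1) and two vectors, one per list.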
Similarly, for the difference vector mvd (mvdLX), replacing LX by L0 or L1 gives the notation used in L0 prediction and L1 prediction, respectively.
The merge index merge_idx is an index indicating which motion compensation parameter, among the motion compensation parameter candidates (merge candidates) derived from blocks whose decoding is completed, is used as the motion compensation parameter of the decoding target block. As described later, the merge index merge_idx is an index indicating a reference block when prediction parameters are referenced in the prediction parameter memory 307. The merge index merge_idx is a symbol obtained by merge coding. The motion compensation parameters, i.e., the prediction parameters to be referenced, are, for example, the reference picture index refIdxLX and the vector mvLX. The vector mvLX will be described later.
The entropy decoding unit 301 outputs some of the separated symbols to the prediction parameter decoding unit 302. These symbols are, for example, the reference picture index refIdx, the vector index idx, the difference vector mvd, the prediction mode predMode, and the merge index merge_idx. The entropy decoding unit 301 outputs the prediction mode predMode to the predicted image generation unit 308 and stores it in the prediction parameter memory 307. The entropy decoding unit 301 also outputs the quantized coefficients to the inverse quantization/inverse DCT unit 311. The quantized coefficients are coefficients obtained, in the encoding process, by applying a DCT (Discrete Cosine Transform) to the residual signal and quantizing the result.
The external prediction parameter decoding unit 303 refers to the prediction parameters stored in the prediction parameter memory 307 based on the symbols input from the entropy decoding unit 301, and decodes the external prediction parameters. The external prediction is a prediction process performed between different pictures (for example, between time points and between layer pictures). The extrinsic prediction parameters are parameters used for extrinsic prediction of the picture block, such as the vector mvLX and the reference picture index refIdxLX.
Among the vectors mvLX, there are motion vectors and displacement vectors (disparity vectors). A motion vector is a vector representing the positional offset between a block in a picture of a certain layer at a certain time and the corresponding block in a picture of the same layer at a different time (for example, an adjacent discrete time). A displacement vector is a vector representing the positional offset between a block in a picture of a certain layer at a certain time and the corresponding block in a picture of a different layer at the same time. Pictures of different layers may be pictures of different viewpoints, pictures of different resolutions, and so on. In particular, a displacement vector between pictures of different viewpoints is referred to as a disparity vector. In the following description, when a motion vector and a displacement vector need not be distinguished, the vector mvLX is simply called a vector. The prediction vector and the difference vector relating to the vector mvLX are referred to as the prediction vector mvpLX and the difference vector mvdLX, respectively. The prediction vector mvpLX is a vector obtained by applying prediction processing to the vector mvLX.
In the following description, unless otherwise mentioned, a case where the vector mvLX is a two-dimensional vector including a component value (X component) along the horizontal direction (X direction) and a component value (Y component) along the vertical direction (Y direction) is exemplified. That is, when the vector mvLX is a displacement vector (disparity vector), the X component thereof is a part of information indicating displacement (disparity), and the Y component thereof is another part of information indicating displacement (disparity). In addition, regarding the difference vector mvdLX, a motion vector and a displacement vector are also distinguished. The determination of the vector type of the vector mvLX and the difference vector mvdLX is performed using the reference image index refIdxLX attached to the vector as described later.
When it is explicitly shown that the vector mvLX is a displacement vector, the vector is hereinafter referred to as a displacement vector dvLX. The prediction vector and the difference vector relating to the displacement vector dvLX are referred to as a prediction vector dvpLX and a difference vector dvdLX, respectively. When the vector mvLX is a motion vector, the vector may be referred to as a motion vector mvLX. The prediction vector and the difference vector relating to the motion vector mvLX may be simply referred to as a prediction vector mvpLX and a difference vector mvdLX, respectively.
The external prediction parameter decoding unit 303 outputs the decoded external prediction parameters to the predicted image generation unit 308, and stores the parameters in the prediction parameter memory 307. Details of the external prediction parameter decoding unit 303 will be described later.
The intra-prediction parameter decoding unit 304 refers to the prediction parameters stored in the prediction parameter memory 307 and decodes the intra-prediction parameters based on the symbol input from the entropy decoding unit 301, for example, the prediction mode PredMode. The intra prediction parameter is a parameter used in the process of predicting a picture block within 1 picture, such as intra prediction mode IntraPredMode. The intra-prediction parameter decoding unit 304 outputs the decoded intra-prediction parameters to the predicted image generating unit 308, and stores the parameters in the prediction parameter memory 307.
The reference picture memory 306 stores the block of the reference picture (reference picture block) generated by the addition unit 312 at a predetermined position for each picture and block to be decoded.
The prediction parameter memory 307 stores the prediction parameters at predetermined positions for each image and block to be decoded. Specifically, the prediction parameter memory 307 stores the external prediction parameters decoded by the external prediction parameter decoding unit 303, the intra prediction parameters decoded by the intra-prediction parameter decoding unit 304, and the prediction mode predMode separated by the entropy decoding unit 301. Examples of the stored external prediction parameters include an external prediction flag, a reference picture index refIdxLX, and a vector mvLX.
The predicted image generation unit 308 receives the prediction mode predMode from the entropy decoding unit 301 and the prediction parameters from the prediction parameter decoding unit 302. The predicted image generation unit 308 reads the reference image from the reference image memory 306. The predicted image generation unit 308 generates a predicted image block P using the input prediction parameters and the read reference image in the prediction mode indicated by the prediction mode predMode.
Here, when the prediction mode predMode indicates the external prediction mode, the external prediction image generation unit 309 performs external prediction using the external prediction parameters input from the external prediction parameter decoding unit 303 and the read reference image. The external prediction is performed for each PU. As described above, a PU is a part of an image composed of a plurality of pixels, corresponds to a decoding target block, and is the unit on which prediction processing is performed at one time. In the external prediction, there are a merge prediction (merge) mode and an AMVP (Adaptive Motion Vector Prediction) mode. In both the merge prediction mode and the AMVP mode, prediction parameters are derived using the prediction parameters of already processed blocks. The merge prediction mode is a mode in which a difference vector is not decoded from the encoded data and the vector of the prediction parameter is used as it is as the vector of the decoding target block, and the AMVP mode is a mode in which a difference vector is decoded from the encoded data and the sum of the vector of the prediction parameter and the difference vector is used as the vector of the decoding target block. Whether the prediction mode is the merge prediction mode or the AMVP mode is identified by the value of the merge flag. These prediction modes will be described later.
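The relationship between the two modes can be sketched as follows (a simplified illustration in Python, not the device itself; the function name `derive_mv` and the list-based candidate representation are assumptions for illustration):

```python
def derive_mv(merge_flag, candidates, idx, mvd=(0, 0)):
    """Derive the vector mvLX of the decoding target block.

    merge_flag == 1: merge prediction mode -- the candidate indicated
    by the merge index is used as it is (no difference vector decoded).
    merge_flag == 0: AMVP mode -- the selected prediction vector mvpLX
    is added to the decoded difference vector mvdLX.
    Vectors are (x, y) tuples.
    """
    mvp = candidates[idx]
    if merge_flag:
        return mvp                                 # merge: vector used as-is
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])      # AMVP: mvLX = mvpLX + mvdLX
```

For example, with a prediction vector (3, 4) and a difference vector (1, -2), AMVP mode yields (4, 2), while merge mode returns the candidate unchanged.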
The external predicted image generating unit 309 reads out, from the reference image memory 306, a reference image block located at a position indicated by the vector mvLX with reference to the decoding target block from the reference image indicated by the reference image index refIdxLX. The external predicted image generation unit 309 performs prediction on the read reference image block to generate a predicted image block P. The external predicted image generation section 309 outputs the generated predicted image block P to the addition section 312.
When the prediction mode predMode indicates the intra prediction mode, the intra prediction image generation unit 310 performs intra prediction using the intra prediction parameters input from the intra-prediction parameter decoding unit 304 and the read reference image. Specifically, the intra prediction image generation unit 310 reads out, from the reference image memory 306, reference image blocks that are among the already decoded blocks of the decoding target image and that are located within a predetermined range from the decoding target block. When the decoding target block moves sequentially in the so-called raster scan order, the predetermined range is, for example, one of the adjacent blocks on the left, top, and upper right, and differs depending on the intra prediction mode. The raster scan order is an order of moving from the left end to the right end for each line in order from the upper end to the lower end of each image.
The intra prediction image generation unit 310 generates a prediction image block by predicting the read reference image block in the prediction mode indicated by the intra prediction mode IntraPredMode. The intra prediction image generation unit 310 outputs the generated prediction image block P to the addition unit 312.
The inverse quantization/inverse DCT unit 311 inversely quantizes the quantization coefficients input from the entropy decoding unit 301 to obtain DCT coefficients. The inverse quantization/inverse DCT unit 311 performs an inverse DCT (Inverse Discrete Cosine Transform) on the obtained DCT coefficients to calculate a decoded residual signal. The inverse quantization/inverse DCT unit 311 outputs the calculated decoded residual signal to the addition unit 312.
The addition unit 312 adds, for each pixel, the signal value of the predicted image block P input from the external predicted image generation unit 309 or the intra prediction image generation unit 310 and the signal value of the decoded residual signal input from the inverse quantization/inverse DCT unit 311, thereby generating a reference image block. The addition unit 312 stores the generated reference image block in the reference image memory 306, and outputs the decoded layer image Td, in which the generated reference image blocks are combined for each image, to the outside.
(configuration of external prediction parameter decoding section)
Next, the configuration of the external prediction parameter decoding unit 303 will be described.
Fig. 3 is a schematic diagram showing the configuration of the external prediction parameter decoding unit 303 according to the present embodiment.
The external prediction parameter decoding unit 303 includes an external prediction parameter extraction unit 3031, an AMVP prediction parameter derivation unit 3032, an addition unit 3035, and a merge prediction parameter derivation unit 3036.
The external prediction parameter extraction unit 3031 extracts prediction parameters for each prediction mode from the input external prediction parameters, based on the merge flag input from the entropy decoding unit 301.
Here, when the value indicated by the merge flag is 1, that is, when the merge prediction mode is indicated, the external prediction parameter extraction unit 3031 extracts, for example, a merge index merge_idx as a prediction parameter relating to merge prediction. The external prediction parameter extraction unit 3031 outputs the extracted merge index merge_idx to the merge prediction parameter derivation unit 3036.
When the value indicated by the merge flag is 0, that is, when the AMVP prediction mode is indicated, the external prediction parameter extraction unit 3031 extracts the AMVP prediction parameter.
Examples of the AMVP prediction parameters include an external prediction flag, a reference picture index refIdxLX, a vector index mvp_LX_idx, and a difference vector mvdLX. The external prediction parameter extraction unit 3031 outputs the extracted reference picture index refIdxLX to the AMVP prediction parameter derivation unit 3032 and the predicted image generation unit 308 (fig. 2), and stores the reference picture index refIdxLX in the prediction parameter memory 307 (fig. 2). The external prediction parameter extraction unit 3031 outputs the extracted vector index mvp_LX_idx to the AMVP prediction parameter derivation unit 3032. The external prediction parameter extraction unit 3031 outputs the extracted difference vector mvdLX to the addition unit 3035. The configuration of the external prediction parameter extraction unit 3031 will be described later.
The AMVP prediction parameter derivation unit 3032 includes a vector candidate derivation unit 3033 and a prediction vector selection unit 3034.
The vector candidate derivation unit 3033 reads out the vector (motion vector or displacement vector) stored in the prediction parameter memory 307 (fig. 2) as a vector candidate based on the reference picture index refIdx.
The read vectors are the vectors relating to blocks located within a predetermined range from the decoding target block (for example, all or some of the blocks adjacent to the lower left end, the upper left end, and the upper right end of the decoding target block).
The prediction vector selection unit 3034 selects, as the prediction vector mvpLX, the vector candidate indicated by the vector index mvp_LX_idx input from the external prediction parameter extraction unit 3031 from among the vector candidates read by the vector candidate derivation unit 3033. The prediction vector selection unit 3034 outputs the selected prediction vector mvpLX to the addition unit 3035.
The addition unit 3035 adds the prediction vector mvpLX input from the prediction vector selection unit 3034 and the difference vector mvdLX input from the external prediction parameter extraction unit 3031 to calculate the vector mvLX. The addition unit 3035 outputs the calculated vector mvLX to the predicted image generation unit 308 (fig. 2).
The merge prediction parameter derivation unit 3036 includes a merge candidate derivation unit and a merge candidate selection unit, which are not shown. The merge candidate derivation unit reads out the prediction parameters (the vector mvLX and the reference image index refIdxLX) stored in the prediction parameter memory 307 according to a predetermined rule, and derives the read prediction parameters as merge candidates. The merge candidate selection unit included in the merge prediction parameter derivation unit 3036 selects, from the derived merge candidates, the merge candidate (the vector mvLX and the reference picture index refIdxLX) indicated by the merge index merge_idx input from the external prediction parameter extraction unit 3031. The merge candidate selected by the merge prediction parameter derivation unit 3036 is stored in the prediction parameter memory 307 (fig. 2), and is output to the predicted image generation unit 308 (fig. 2).
(configuration of external prediction parameter extraction section)
Next, the configuration of the external prediction parameter extraction unit 3031 will be described.
Fig. 4 is a schematic diagram showing the configuration of the external prediction parameter extraction unit 3031 according to the present embodiment, particularly the configuration related to decoding of the difference vector mvdLX.
The external prediction parameter extraction unit 3031 includes a reference layer determination unit 30311 and a vector difference decoding unit 30312. The reference layer determination unit 30311 is also referred to as an outer layer prediction determination unit, an outer view prediction determination unit, or a displacement vector determination unit.
The reference layer determination unit 30311 specifies the reference layer information reference_layer_info indicating the relationship between the reference image indicated by the reference image index refIdxLX and the target image, based on the reference image index refIdxLX input from the entropy decoding unit 301. When the reference image index refIdxLX is one attached to the vector mvLX, the reference layer information of the vector mvLX is derived, and when the reference image index refIdxLX is one attached to the difference vector mvdLX, the reference layer information of the difference vector is derived. The reference layer information reference_layer_info is information indicating whether the prediction is prediction between different layers, that is, whether the layer of the target image and the layer of the reference image differ. When the layer of the target image is a certain viewpoint image and the layer of the reference image is a viewpoint image different from that viewpoint image, the outer layer prediction corresponds to prediction using a disparity vector. Therefore, the reference layer information reference_layer_info is also information indicating whether the vector mvLX for the reference image is a displacement vector or a motion vector.
Prediction in the case where the layer of the target image and the layer of the reference image are the same layer is called same layer prediction, and the vector obtained in this case is a motion vector. Prediction in the case where the layer of the target image and the layer of the reference image are different layers is called outer layer prediction, and the vector obtained in this case is a displacement vector. The reference layer information reference_layer_info is represented by, for example, a variable whose value is 1 in the case where the vector mvLX is a vector subjected to outer layer prediction (a displacement vector, disparity vector), and whose value is 0 in the case where the vector mvLX is a vector subjected to same layer prediction (a motion vector). The reference layer determination unit 30311 outputs the generated reference layer information reference_layer_info to the vector difference decoding unit 30312. The reference layer determination unit 30311 may also use, instead of the reference image index refIdxLX, a POC, a view identifier (view_id) described later, or a flag indicating whether or not the reference image is a long-time reference image.
Here, first to fourth determination methods will be described as examples of the determination process of the reference layer determination unit 30311. The reference layer determination unit 30311 may use any one of the first to fourth determination methods or any combination of these methods.
< first determination method >
The reference layer determination unit 30311 determines that the vector mvLX is a displacement vector when the POC (Picture Order Count) of the reference image indicated by the reference image index refIdxLX is equal to the POC of the decoding target image. The POC is a number indicating the order in which images are displayed, and is an integer (discrete time) indicating the time at which the image is acquired. When it is not determined to be a displacement vector, the reference layer determination unit 30311 determines that the vector mvLX is a motion vector.
Specifically, when the POC of the reference image indicated by the reference image index refIdxLX is equal to the POC of the decoding target image, the reference layer determination unit 30311 determines that the vector mvLX is a displacement vector using, for example, the following expression.
POC==RefLayerPOC(refIdxLX, ListX)
Here, POC is the POC of the decoding target image, and RefLayerPOC(X, Y) is the POC of the reference image designated by the reference image index X and the reference image list Y.
In addition, the fact that a reference image whose POC is equal to the POC of the decoding target image can be referred to means that the layer of the reference image is different from the layer of the decoding target image. Therefore, when the POC of the decoding target image and the POC of the reference image are equal, it is determined that outer layer prediction is performed (the vector is a displacement vector), and otherwise it is determined that same layer prediction is performed (the vector is a motion vector).
< second determination method >
The reference layer determination unit 30311 may determine that the vector mvLX is a displacement vector (a disparity vector; outer layer prediction is performed) when the viewpoint of the reference image indicated by the reference image index refIdxLX and the viewpoint of the decoding target image are different from each other. Specifically, when the view identifier view_id of the reference image indicated by the reference image index refIdxLX is different from the view identifier view_id of the decoding target image, the reference layer determination unit 30311 determines that the vector mvLX is a displacement vector using, for example, the following expression.
ViewID!=RefLayerViewID(refIdxLX, ListX)
Here, ViewID is the view ID of the decoding target image, and RefLayerViewID(X, Y) is the view ID of the reference image specified by the reference image index X and the reference image list Y.
The view identifier view_id is information identifying each viewpoint image. The difference vector dvdLX relating to a displacement vector follows the rule that it is obtained between images of different viewpoints and is not obtained between images of the same viewpoint. When it is not determined to be a displacement vector, the reference layer determination unit 30311 determines that the vector mvLX is a motion vector.
Since each view image is a type of layer, the reference layer determination unit 30311 determines the vector mvLX as a displacement vector (subjected to the outer layer prediction) when it is determined that the view identifier view _ id is different from each other, and determines a motion vector (subjected to the same layer prediction) otherwise.
< third determination method >
The reference layer determination unit 30311 may determine that the vector mvLX is a displacement vector when the layer identifier layer_id of the reference image indicated by the reference image index refIdxLX and the layer identifier layer_id of the decoding target image are different from each other, for example, using the following expression.
layerID!=RefLayerID(refIdxLX, ListX)
Here, layerID is the layer ID of the decoding target image, and RefLayerID(X, Y) is the layer ID of the reference image specified by the reference image index X and the reference image list Y.
The layer identifier layer_id is data for identifying each layer when 1 picture is composed of data of a plurality of layers. In coded data in which pictures of different viewpoints are coded, the layer identifier follows the rule of having a different value for each viewpoint. That is, the difference vector dvdLX relating to a displacement vector is a vector obtained between the target image and an image relating to a different layer. When it is not determined to be a displacement vector, the reference layer determination unit 30311 determines that the vector mvLX is a motion vector.
When the layer identifiers layer_id are different from each other, the reference layer determination unit 30311 determines that the vector mvLX is a displacement vector (outer layer prediction is performed), and otherwise determines that it is a motion vector (same layer prediction is performed).
< fourth determination method >
In addition, when the reference image indicated by the reference image index refIdxLX is a long-time reference image, the reference layer determination unit 30311 may determine that the vector mvLX is a displacement vector, for example, using the following expression.
LongTermPic(refIdxLX, ListX)
Here, LongTermPic(X, Y) is a function that becomes true when the reference image specified by the reference image index X and the reference image list Y is a long-time reference image.
When it is not determined to be a displacement vector, the reference layer determination unit 30311 determines that the vector mvLX is a motion vector. The long-time reference image is an image managed in the reference image list differently from short-time images. The long-time reference image is mainly a reference image held in the reference image memory 306 for a longer time than short-time images, and a base view image described later is used as a long-time reference image. This is because the target image is a non-base view image at the same time as the reference image (described later). The reference image relating to a displacement vector is processed as a long-time reference image. That is, the difference vector dvdLX relating to a displacement vector is a vector obtained between the target image and a long-time reference image.
In addition, even when the reference image is a long-time reference image, the reference image may be in the same layer as the decoding target image. Therefore, the reference layer determination unit 30311 determines that the vector mvLX is a displacement vector (outer layer prediction is performed) when the reference image is a long-time reference image, and otherwise determines that it is a motion vector (same layer prediction is performed). This is because, although this method cannot strictly determine whether outer layer prediction or same layer prediction is performed, strictness of the determination is not important for the purpose of tool control aimed at improving the coding efficiency.
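The four determination methods can be summarized in one sketch (Python; the function name and the dictionary keys are hypothetical stand-ins for the RefLayerPOC, RefLayerViewID, RefLayerID, and LongTermPic lookups described above):

```python
def is_displacement_vector(target, ref, method):
    """Return True when the vector for the reference picture ref is
    judged to be a displacement (disparity) vector, i.e. outer layer
    prediction is performed; False means a motion vector (same layer
    prediction). target/ref are dicts of the picture properties that
    the reference layer determination unit consults.
    """
    if method == 1:   # first method: equal POC implies a different layer
        return target["poc"] == ref["poc"]
    if method == 2:   # second method: different view identifiers
        return target["view_id"] != ref["view_id"]
    if method == 3:   # third method: different layer identifiers
        return target["layer_id"] != ref["layer_id"]
    if method == 4:   # fourth method: long-time reference picture
        return ref["long_term"]
    raise ValueError("unknown determination method")
```

As the text notes, the fourth method is only approximate: a long-time reference image may still belong to the same layer as the target image.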
The vector difference decoding unit 30312 receives the reference layer information reference_layer_info from the reference layer determination unit 30311, decodes the syntax elements for deriving the difference vector mvdLX using the entropy decoding unit 301, and outputs the difference vector mvdLX. A syntax element is a structural element of the coding parameters. Here, the syntax elements used to derive the difference vector mvdLX are abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], and abs_mvd_minus2[XY]. Here, XY is a variable that takes 0 or 1 as its value, and indicates the X component (horizontal component) when XY is 0, and the Y component (vertical component) when XY is 1.
When the reference layer information reference_layer_info is 1, that is, when the vector mvLX is a displacement vector (outer layer prediction is performed), the vector difference decoding unit 30312 decodes the component value (X component) dvd_X for the horizontal direction (X direction) from the coded stream Te, and determines the component value (Y component) dvd_Y for the vertical direction (Y direction) to be a predetermined value, for example, zero. The vector difference decoding unit 30312 combines the X component dvd_X and the Y component dvd_Y to form the difference vector dvdLX, thereby decoding the difference vector dvdLX.
On the other hand, when the vector mvLX is a motion vector, the vector difference decoding unit 30312 decodes the X component mvd_X and the Y component mvd_Y from the coded stream Te and outputs them as the difference vector mvdLX.
The vector difference decoding unit 30312 outputs the decoded difference vector mvdLX (or dvdLX) to the addition unit 3035.
Thus, in the above-described example, even though the Y component dvd_Y of the difference vector dvdLX relating to the displacement vector is not encoded as a symbol of the coded stream Te, a value predetermined on the image decoding apparatus 31 side is used as the value of the Y component. On the other hand, for a displacement vector in which the X component is dominant and the Y component is a negligible value, for example, zero, the accuracy of displacement prediction does not deteriorate in the above example. Therefore, in the above example, there is an effect that the coding efficiency is improved without deteriorating the quality of the decoded image.
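The decoding rule described above can be sketched as follows (Python; `read_component` stands in for entropy decoding of one difference-vector component from the coded stream Te, and the names are assumptions for illustration):

```python
def decode_mvd(reference_layer_info, read_component):
    """Decode a difference vector as the vector difference decoding
    unit 30312 does: when reference_layer_info is 1 (displacement
    vector, outer layer prediction) only the X component is read and
    the Y component is fixed to the predetermined value zero;
    otherwise both components are read from the stream.
    """
    x = read_component()                               # dvd_X or mvd_X
    y = 0 if reference_layer_info == 1 else read_component()   # dvd_Y fixed to 0
    return (x, y)
```

With this rule, a displacement-vector difference consumes only one coded component, which is the source of the coding-efficiency gain described above.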
Here, an example of the structure of the coded data (syntax table) processed by the external prediction parameter extraction unit 3031 will be described.
Fig. 21 is a diagram showing an example of the structure of encoded data.
Fig. 21(a) to (c) show a syntax 603, and the syntax 603 shows that a view configuration flag (camera arrangement 1D flag) camera_arrangement_1D_flag is extracted from the symbols input from the entropy decoding unit 301. The view configuration flag camera_arrangement_1D_flag is a symbol indicating whether the viewpoints are arranged one-dimensionally. For example, in the case where the value of the view configuration flag camera_arrangement_1D_flag is 1, it indicates that the viewpoints are arranged one-dimensionally. In the case where the value of the view configuration flag camera_arrangement_1D_flag is 0, it indicates that the viewpoints are not arranged one-dimensionally. Accordingly, the vector difference decoding unit 30312 extracts the view configuration flag camera_arrangement_1D_flag from the syntax 603.
Fig. 21(a) is an example of the video parameter set video_parameter_set_rbsp(), (b) is an example of the picture parameter set pic_parameter_set_rbsp, and (c) is an example of the slice header slice_header(). In this example, the view configuration flag camera_arrangement_1D_flag is extracted when the view identifier view_id is other than 0 (when the image is other than the base view), but the flag may also be extracted without referring to the view identifier view_id.
Fig. 22 is a diagram showing an example of a coded stream structure.
In fig. 22, in the case where the value of the view configuration flag camera_arrangement_1D_flag is 1 and the value of the reference layer information reference_layer_info is equal to the value of INTER_VIEW (i.e., 1), the syntax 604 indicates that the syntax mvd_coding1D is included in the coded stream. In other cases, the syntax 604 indicates that the syntax mvd_coding is included in the coded stream.
That is, when the syntax 604 indicates that the views are one-dimensionally arranged (camera _ arrangement _1D _ flag is 1), the vector difference decoding unit 30312 executes syntax for extracting different symbols depending on whether the vector mvLX is a displacement vector or a motion vector.
Syntax mvd_coding1D and syntax mvd_coding are shown as syntax 605 and syntax 606 in fig. 22, respectively. The meanings of the syntax elements abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], abs_mvd_minus2[XY], and mvd_sign_flag[XY] are as already described. Comparing syntax 605 with syntax 606, when the flag indicating that the viewpoints are one-dimensionally arranged is 1, only the syntax elements whose suffix XY is 0, that is, only the X component of the difference vector, are included in the coded stream Te. That is, the vector difference decoding unit 30312 extracts (decodes) only the X component of the difference vector. Otherwise, when the flag indicating that the viewpoints are one-dimensionally arranged is 0, the syntax elements are decoded both for the case where the suffix XY is 0 and for the case where the suffix XY is 1. That is, the vector difference decoding unit 30312 extracts (decodes) both the X component and the Y component. The details of the syntax mvd_coding1D will be described below.
The syntax mvd_coding1D is a syntax indicating that three kinds of symbols representing the X component mvdLX[0] of the difference vector relating to the displacement vector, namely abs_mvd_greater0_flag[0], abs_mvd_minus2[0], and mvd_sign_flag[0], are extracted. abs_mvd_greater0_flag[0] is a symbol indicating whether the absolute value of the X component mvdLX[0] is greater than 0. When the absolute value of the X component mvdLX[0] is greater than 0, the value of abs_mvd_greater0_flag[0] is 1. When the absolute value of the X component mvdLX[0] is 0, the value of abs_mvd_greater0_flag[0] is 0.
abs_mvd_minus2[0] is a symbol indicating, when the absolute value of the X component mvdLX[0] is greater than 1, a value 2 less than that absolute value. mvd_sign_flag[0] is a symbol indicating the sign of the X component mvdLX[0]. When the X component mvdLX[0] takes a positive value, the value of mvd_sign_flag[0] is 0. When the X component mvdLX[0] takes a negative value, the value of mvd_sign_flag[0] is 1.
Therefore, the vector difference decoding unit 30312 executes the syntax 605 to extract these three kinds of symbols, abs_mvd_greater0_flag[0], abs_mvd_minus2[0], and mvd_sign_flag[0]. Then, the vector difference decoding unit 30312 calculates the X component mvdLX[0] of the difference vector using the following equations.
if(!abs_mvd_greater0_flag[0]) abs_mvd_minus2[0]=-2
if(!abs_mvd_greater1_flag[0]) abs_mvd_minus2[0]=-1
mvdLX[0]=(abs_mvd_minus2[0]+2)*(1-2*mvd_sign_flag[0])
That is, the first two expressions indicate that abs_mvd_minus2[0] is set to -2 when abs_mvd_greater0_flag[0] is 0, and that abs_mvd_minus2[0] is set to -1 when abs_mvd_greater1_flag[0] is 0. The last expression indicates that the value obtained by adding 2 to abs_mvd_minus2[0] is given a positive or negative sign according to mvd_sign_flag[0], which determines the value mvdLX[0] of the X component of the difference vector. The vector difference decoding unit 30312 determines the Y component mvdLX[1] of the difference vector to be 0.
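The reconstruction of one component from these syntax elements can be written out directly (a sketch following the equations above; syntax elements that are absent from the stream are passed as 0, as in the text):

```python
def decode_abs_mvd(greater0_flag, greater1_flag, abs_mvd_minus2, sign_flag):
    """Reconstruct one difference-vector component mvdLX[XY] from
    abs_mvd_greater0_flag, abs_mvd_greater1_flag, abs_mvd_minus2 and
    mvd_sign_flag, following the equations above.
    """
    if not greater0_flag:
        abs_mvd_minus2 = -2      # absolute value is 0
    elif not greater1_flag:
        abs_mvd_minus2 = -1      # absolute value is 1
    # sign_flag == 0 gives a positive value, sign_flag == 1 a negative one
    return (abs_mvd_minus2 + 2) * (1 - 2 * sign_flag)
```

For example, greater0 = 1, greater1 = 1, abs_mvd_minus2 = 3, sign = 1 reconstructs the component value -5.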
In addition, the coded stream structure may be a structure in which only the syntax mvd_coding is used, instead of switching between the syntax mvd_coding1D and the syntax mvd_coding according to whether the viewpoints are one-dimensionally arranged (camera_arrangement_1D_flag = 1) by the syntax 604. In this case, as shown in the following expression, the Y component of the difference vector (abs_mvd_greater0_flag[1]) is included in the coded stream only when it is not the case that camera_arrangement_1D_flag is 1 and reference_layer_info(ref_idx_lX[x0][y0]) is INTER_VIEW.
if(!(camera_arrangement_1D_flag==1&&reference_layer_info(ref_idx_lX[x0][y0])==INTER_VIEW))
abs_mvd_greater0_flag[1]
In the case of the above-described coded stream structure, the vector difference decoding unit 30312 decodes the Y component of the difference vector (abs_mvd_greater0_flag[1]) only when it is not the case that camera_arrangement_1D_flag is 1 and reference_layer_info(ref_idx_lX[x0][y0]) is INTER_VIEW.
When the flag indicating whether or not the absolute value of the Y component of the difference vector is greater than 0 (abs_mvd_greater0_flag[1]) is not decoded, the vector difference decoding unit 30312 sets the flag to 0 (abs_mvd_greater0_flag[1] = 0).
(example of reference image List)
Next, an example of the reference picture list will be described. The reference picture list is a sequence composed of the reference pictures stored in the reference picture memory 306 (fig. 2).
Fig. 17 is a conceptual diagram illustrating an example of the reference picture list.
In the reference image list 601, 5 rectangles arranged in a row on the left and right represent reference images, respectively. The symbols P1, P2, Q0, P3, and P4 shown in this order from the left end to the right end represent the respective reference images. P of P1, etc. represents a viewpoint P, and Q of Q0 represents a viewpoint Q different from viewpoint P. The suffixes of P and Q indicate the picture sequence number POC. The downward arrow immediately below refIdxLX indicates that reference image index refIdxLX is an index of reference image Q0 in reference image memory 306.
(an example of vector candidates)
Next, an example of the above-described vector candidates will be described.
Fig. 18 is a conceptual diagram illustrating an example of vector candidates.
The prediction vector list 602 shown in fig. 18 is a list including a plurality of vector candidates derived by the vector candidate derivation unit 3033.
In the prediction vector list 602, five rectangles arranged in a horizontal row each represent a region indicating a prediction vector candidate. The arrow directly below the second mvp_LX_idx from the left end, and the pmv below it, indicate that the vector index mvp_LX_idx is the index referring to the vector pmv in the prediction parameter memory 307.
(acquisition of candidate vector)
Next, an example of a method of acquiring a candidate vector will be described. A candidate vector is generated based on the vector of a referenced block, where the referenced block is a block for which decoding has been completed and which lies within a predetermined range of the decoding target block (for example, an adjacent block).
Fig. 20 is a conceptual diagram illustrating an example of an adjacent block.
The PU indicated by a quadrangle in fig. 20 is a decoding target block. The neighboring blocks are represented by 2 quadrangles NBa0, NBa1 adjacent to the left side of the PU, and 3 quadrangles NBb2, NBb1, NBb0 adjacent to the upper side of the PU.
The vector candidate derivation unit 3033 (fig. 3) reads vectors from the prediction parameter memory 307 in sequence from each of the 2 blocks NBa0 and NBa1 adjacent on the left side of the PU, and derives 1 candidate vector based on the read vectors. The vector candidate derivation unit 3033 reads vectors from the prediction parameter memory 307 in sequence from each of the 3 blocks NBb2, NBb1, NBb0 adjacent to the upper side of the PU, and derives 1 candidate vector based on the read vectors.
When deriving the candidate vector, the vector candidate derivation unit 3033 preferentially performs the processing described below in the order of (1) and (2).
(1) When the reference picture of the adjacent block is identical to the reference picture indicated by the reference picture index refIdx and the prediction direction LX of the decoding target block, the vector of the adjacent block is read. Thus, when the vector indicated by the reference picture index refIdx and the prediction direction LX is a displacement vector, it is determined that the vector referred to in the adjacent block is a displacement vector.
However, the prediction parameters to be referred to in the neighboring blocks include a prediction parameter for a prediction direction indicated by LX and a prediction parameter for a prediction direction (LY) different from the prediction direction indicated by LX. Here, the vector relating to the prediction direction indicated by LX and the vector relating to the prediction direction indicated by LY are referred to in this order.
When the vector of the prediction direction indicated by LX is read, it is determined whether or not the reference picture index refIdx of the reference picture and the reference picture index of the decoded picture are equal to each other. This makes it possible to determine whether or not the image to be decoded is the same as the image from which the vector has been read.
When the vector relating to the prediction direction indicated by LY is read, it is determined whether or not the picture sequence number POC of the picture from which the vector is read and the POC of the decoding target picture are equal to each other. This makes it possible to determine whether or not the image to be decoded is the same as the image from which the vector has been read. This is because, even if the reference picture index is the same, the picture to be referred to differs depending on the prediction direction.
When it is determined in (1) that a vector cannot be read from the same image as the reference image of the decoding target block, the following processing (2) is performed.
When both the reference picture to be referred to in the adjacent block and the reference picture indicated by the reference picture index refIdxLX of the block to be decoded are long-time reference pictures, the vectors are referred to in the order of the vector indicated by LX and the vector indicated by LY. The reference image relating to the base view is stored as a long-time reference image in the reference image memory 306. That is, when the reference picture indicated by the reference picture index refIdxLX is a long-time reference picture, the vector relating to the block of the reference picture is a displacement vector. Therefore, the displacement vectors of the adjacent blocks are referred to.
When the reference picture referred to by the adjacent block and the reference picture indicated by the reference picture index refIdxLX of the decoding target block are both short-time reference pictures, the vector read from the adjacent block is subjected to scaling processing. By the scaling processing, the value of the read vector is brought into a predetermined range. A short-time reference picture is a reference picture other than a long-time reference picture; it is stored in the reference picture memory 306 only for a predetermined time and is deleted when that time elapses.
In addition to (1) and (2), when depth information (depth map) indicating a distance in the depth direction of the object is input, the vector candidate derivation unit 3033 may extract depth information in the decoding target block and calculate a displacement vector indicating a displacement having a magnitude corresponding to the extracted depth information as a candidate vector.
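The priority order (1)-(2) above can be sketched as follows for a single neighbouring block. This is a hedged illustration: the dictionary layout, the `scale` callable, and the handling of a missing candidate are assumptions for the sketch, not the actual decoder structures.

```python
def derive_candidate(nb, tgt, scale):
    """Derive one candidate vector from a neighbouring block nb for the
    decoding target block tgt.

    nb['LX'] / nb['LY'] each hold {'ref_idx', 'ref_poc', 'long_term', 'vec'};
    tgt holds {'ref_idx', 'ref_poc', 'long_term'} for the target reference.
    scale(vec) performs the scaling of step (2) for short-term references.
    """
    # (1) Prefer a neighbour vector referring to the same picture:
    #     LX is compared by reference picture index, LY by POC, because
    #     the same index can name a different picture in direction LY.
    if nb['LX']['ref_idx'] == tgt['ref_idx']:
        return nb['LX']['vec']
    if nb['LY']['ref_poc'] == tgt['ref_poc']:
        return nb['LY']['vec']
    # (2) Otherwise: long-term/long-term pairs (displacement vectors,
    #     e.g. base-view references) are used unscaled; short-term/
    #     short-term pairs are scaled into the predetermined range.
    for d in ('LX', 'LY'):
        if nb[d]['long_term'] and tgt['long_term']:
            return nb[d]['vec']
        if not nb[d]['long_term'] and not tgt['long_term']:
            return scale(nb[d]['vec'])
    return None  # no usable candidate from this neighbour
```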
(example of reference image)
Next, an example of a reference picture used for deriving a vector will be described.
Fig. 19 is a conceptual diagram showing an example of a reference image.
In fig. 19, the horizontal axis represents time and the vertical axis represents viewpoint.
The rectangles 2 rows in the vertical direction and 3 columns in the horizontal direction (6 in total) shown in fig. 19 represent images, respectively. Of the 6 rectangles, the rectangle in the 2 nd column from the left side of the lower row represents an image to be decoded (target image), and the remaining 5 rectangles represent reference images, respectively. The reference image Q0 indicated by an arrow upward from the object image is an image which is the same in time as the object image and is different in viewpoint. The reference image Q0 is used for displacement prediction based on the target image. The reference image P1 indicated by an arrow to the left from the object image is the same viewpoint as the object image and is a past image. The reference image P2 indicated by an arrow to the right from the object image is the same viewpoint as the object image and is a future image. For motion prediction based on the target image, the reference image P1 or P2 is used.
(modification D1)
Next, a modification D1 of the present embodiment will be described. The same reference numerals are given to the same structures, and the above description is applied.
The image decoding device 31a according to the present modification includes an AMVP prediction parameter derivation unit 3032a in place of the AMVP prediction parameter derivation unit 3032 in the external prediction parameter decoding unit 303 (fig. 3) of the image decoding device 31. The image decoding device 31a according to modification D1 has the same configuration as the image decoding device 31 (fig. 2) in the other configurations.
Fig. 5 shows a schematic diagram illustrating the configuration of the AMVP prediction parameter derivation unit 3032a according to this modification.
The AMVP prediction parameter derivation unit 3032a includes a displacement prediction vector limiter unit 30321a in addition to the vector candidate derivation unit 3033 and the prediction vector selection unit 3034.
The vector candidate derivation unit 3033 includes an extended vector candidate derivation unit 30335, a vector candidate storage unit 30339, and a base vector candidate derivation unit (described later) not shown.
The extended vector candidate derivation unit 30335 includes a displacement vector acquisition unit 30336 and an outer layer vector candidate derivation unit 30337 (outer view vector candidate derivation unit).
The reference picture index refIdx and the reference layer information reference_layer_info are input to the displacement vector acquisition unit 30336 from the external prediction parameter extraction unit 3031 (fig. 3). When the reference layer information reference_layer_info indicates that the vector mvLX is a displacement vector, the displacement vector acquisition unit 30336 reads the displacement vector dvLX for the reference picture indicated by the reference picture index refIdx from the prediction parameter memory 307. The read displacement vectors are the displacement vectors dvLX of blocks located within a predetermined range of the decoding target block. The displacement vector acquisition unit 30336 outputs the read displacement vector dvLX to the outer layer vector candidate derivation unit 30337.
The reference picture index refIdxLX is input from the external prediction parameter extraction unit 3031 (fig. 4) to the external layer vector candidate derivation unit 30337, and the displacement vector dvLX of each block is input from the displacement vector acquisition unit 30336.
The outer layer vector candidate derivation section 30337 specifies the reference picture (layer image) corresponding to the reference picture index refIdx. When the reference image is a layer image different from the target image, the outer layer vector candidate derivation section 30337 outputs the displacement vector input from the displacement vector acquisition section 30336. For example, the reference image may be a base view while the target image is a non-base view.
When the reference image is a layer image identical to the target image, the outer layer vector candidate derivation unit 30337 specifies a reference layer image different from the target image (here, a layer image of the reference viewpoint), takes the displacement vector input from the displacement vector acquisition unit 30336 for the specified image, and reads from the prediction parameter memory 307 the vector of the reference layer image (layer image of the reference viewpoint) located at the position indicated by that displacement vector.
More specifically, when the reference picture is a layer picture identical to the target picture, the outer layer vector candidate derivation unit 30337 specifies the position of a block located at a position shifted from the start point by the displacement vector dvLX input from the displacement vector acquisition unit 30336, using the decoding target block as the start point, with respect to the specified picture (layer picture), by using the following expression.
xRef=xP+((nPSW-1)>>1)+((dvLX[0]+2)>>2)
yRef=yP+((nPSH-1)>>1)+((dvLX[1]+2)>>2)
This specified block is referred to as the corresponding block. Here, xRef and yRef are the coordinates of the corresponding block, xP and yP are the upper-left coordinates of the decoding target block, nPSW and nPSH are the width and height of the decoding target block, and dvLX[0] and dvLX[1] are the X component and Y component of the displacement vector input from the displacement vector acquisition unit 30336. x >> a means that x is shifted to the right by a bits; that is, the bit shift operator shifts the bit values toward the lower bits. The outer layer vector candidate derivation unit 30337 reads the vector mvLX for the corresponding block from the prediction parameter memory 307.
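The two equations can be checked with a small sketch. The interpretation that dvLX is stored in quarter-pixel units, so that (v + 2) >> 2 rounds it to integer pixels, is an assumption consistent with the +2 rounding term but not stated explicitly in the text:

```python
def corresponding_block(xP, yP, nPSW, nPSH, dvLX):
    """Coordinates (xRef, yRef) of the corresponding block, computed from
    the upper-left coordinates (xP, yP), the block size (nPSW, nPSH), and
    the displacement vector dvLX, exactly as in the equations above."""
    xRef = xP + ((nPSW - 1) >> 1) + ((dvLX[0] + 2) >> 2)
    yRef = yP + ((nPSH - 1) >> 1) + ((dvLX[1] + 2) >> 2)
    return xRef, yRef
```

For an 8x8 block at (16, 32) and dvLX = (10, -6), this yields (22, 34); note that the arithmetic right shift floors negative values toward the lower bits, matching the bit-shift definition above.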
The outer layer vector candidate derivation unit 30337 stores the read vector mvLX as a vector candidate in the vector candidate storage unit 30339.
The displacement vector limiter 30321a receives the vector predictor mvpLX and the reference picture index refIdx from the vector predictor selector 3034. The prediction vector mvpLX is a two-dimensional vector including, as elements, a component value mvp _ X for the X direction and a component value mvp _ Y for the Y direction.
Here, when the prediction vector mvpLX is a prediction vector dvpLX relating to a displacement vector, that is, when the reference layer determination unit 30311, taking the reference picture index refIdx as input, determines that outer layer prediction is performed, the displacement prediction vector limiter 30321a limits the Y component dvp_Y to a value within a predetermined range. The predetermined range may be, for example, a single value (such as zero). When the Y component dvp_Y is smaller than a predetermined lower limit value (for example, -64 pixels), it is clipped to that lower limit value. When the Y component dvp_Y is larger than a predetermined upper limit value (for example, 64 pixels), it is clipped to that upper limit value.
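The clipping described above may be sketched as follows; the ±64-pixel bounds are the example values from the text, and restricting to a single value such as zero is the degenerate case lower == upper == 0:

```python
def restrict_dvp_y(dvp_y, lower=-64, upper=64):
    """Clip the Y component of a displacement prediction vector to the
    predetermined range, as performed by the displacement prediction
    vector limiter 30321a (bounds are the text's example values)."""
    if dvp_y < lower:
        return lower
    if dvp_y > upper:
        return upper
    return dvp_y
```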
The displacement prediction vector limiter 30321a reconstructs the displacement vector dvpLX from the component value dvp _ X for the X direction and the component value dvp _ Y for the Y direction, the values of which are limited, and outputs the reconstructed displacement vector dvpLX to the adder 3035 (fig. 3).
When the prediction vector mvpLX is a motion prediction vector, that is, when the reference layer determination unit 30311 receives the reference picture index refIdx and does not determine that the external layer prediction is performed, the displacement prediction vector limiter unit 30321a outputs the received motion prediction vector mvpLX to the adder unit 3035.
The reference layer information reference _ layer _ info input from the reference layer determination unit 30311 may be input to the displacement prediction vector limiter 30321 a. The displacement prediction vector limiter 30321a determines that the prediction vector mvpLX is a displacement prediction vector when the input reference layer information reference _ layer _ info indicates that the vector mvLX is a displacement vector.
The process of deriving the candidate of the prediction vector (candidate vector) by the vector candidate derivation unit 3033 will be described later.
The image decoding device 31a described above includes the AMVP prediction parameter derivation unit 3032a, which derives the prediction vector of the decoding target block, and the reference layer determination unit 30311, which determines whether the vector or the vector difference of the decoding target block relates to outer layer prediction, that is, prediction between different layers. When the reference layer determination unit 30311 determines that outer layer prediction is performed, at least one component of the prediction vector of the decoding target block is limited to a predetermined value.
In general, the Y component dv_Y of the displacement vector dvLX tends to take values in a predetermined range, for example distributed so as to be concentrated around zero, but values outside this range may occur in encoding. In that case, the accuracy of displacement prediction deteriorates.
In contrast, in modification D1, since the Y component dvp _ Y of the prediction vector relating to the displacement vector is limited to a value in a predetermined range, it is possible to suppress deterioration of accuracy in displacement prediction, and thus, the encoding efficiency is improved.
(modification D2)
Next, another modification D2 of the present embodiment will be described. The same reference numerals are given to the same structures, and the above description is applied.
The configuration of the image decoding device 31b according to modification D2 is the same as the configuration of the image decoding device 31a according to modification D1. That is, the image decoding apparatus 31b includes the displacement prediction vector limiter 30321a (see fig. 5) similarly to the image decoding apparatus 31a according to modification D1. Mainly, the difference between the two will be described below.
When the vector mvLX is a displacement vector, the image decoding device 31b includes displacement vector restriction information disparity _ restriction in the encoded stream Te input to the entropy decoding unit 301 (see fig. 2).
The displacement vector restriction information disparity_restriction is information indicating whether to restrict the Y component dvp_Y of the prediction vector to a value in a predetermined range and whether to restrict the Y component dvd_Y of the difference vector to a predetermined value. A value of zero indicates that neither the Y component dvp_Y of the prediction vector nor the Y component dvd_Y of the difference vector is restricted. A value of 1 indicates that the Y component dvp_Y of the prediction vector is restricted to a value in a predetermined range, while the Y component dvd_Y of the difference vector is not restricted. A value of 2 indicates that the Y component dvp_Y of the prediction vector is not restricted, while the Y component dvd_Y of the difference vector is restricted to a value in a predetermined range. A value of 3 indicates that both the Y component dvp_Y of the prediction vector and the Y component dvd_Y of the difference vector are restricted to values in a predetermined range.
In particular, regarding the restriction of the Y component of the difference vector, the value of the predetermined range may be set to zero. In that case, when the value of the displacement vector restriction information disparity_restriction is zero or 1, the difference vector dvdLX is a two-dimensional vector including the X component dvd_X and the Y component dvd_Y as elements. When the value of the displacement vector restriction information disparity_restriction is 2 or 3, the difference vector dvdLX is a scalar value including the X component dvd_X as an element and not including the Y component dvd_Y.
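The four values enumerated above behave like two independent bits, which can be made explicit in a small sketch (the bit interpretation is an observation about the enumeration, not syntax defined by the text):

```python
def restriction_flags(disparity_restriction):
    """Split disparity_restriction (0..3) into the two restrictions it
    encodes: bit 0 restricts the Y component of the prediction vector,
    bit 1 restricts the Y component of the difference vector."""
    restrict_pred_y = bool(disparity_restriction & 1)
    restrict_diff_y = bool(disparity_restriction & 2)
    return restrict_pred_y, restrict_diff_y
```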
The entropy decoding unit 301 of the image decoding device 31b separates the displacement vector restriction information disparity _ restriction from the bit stream Te, and outputs the separated displacement vector restriction information disparity _ restriction to the extrinsic prediction parameter extraction unit 3031 (see fig. 3 and 4). Further, the external prediction parameter extraction unit 3031 outputs the displacement vector restriction information disparity _ restriction to the displacement prediction vector limiter unit 30321a (fig. 5).
The displacement vector restriction information disparity _ restriction input from the entropy decoding unit 301 is input to the vector difference decoding unit 30312 (fig. 4) of the external prediction parameter extraction unit 3031 (fig. 3).
When the value of the displacement vector restriction information disparity_restriction is zero or 1, the vector difference decoding unit 30312 outputs the difference vector mvdLX input from the entropy decoding unit 301 to the adding unit 3035 as it is.
When the value of the displacement vector restriction information disparity _ restriction is 2 or 3, the vector differential decoding unit 30312 restricts the Y component dvd _ Y to a value in a predetermined range. The vector difference decoding unit 30312 combines the X component dvd _ X of the input difference vector dvdLX and the Y component dvd _ Y whose value is limited to a predetermined range of values, and reconstructs the difference vector dvdLX. The vector difference decoding unit 30312 outputs the reconstructed difference vector dvdLX to the adding unit 3035 (fig. 3).
The displacement prediction vector limiter 30321a (fig. 5) receives the displacement vector restriction information disparity_restriction as input from the external prediction parameter extraction unit 3031. When the value of the displacement vector restriction information disparity_restriction is zero or 2, the displacement prediction vector limiter 30321a outputs the prediction vector dvpLX input from the prediction vector selection unit 3034 to the adding unit 3035 as it is. When the value of the displacement vector restriction information disparity_restriction is 1 or 3, the displacement prediction vector limiter 30321a restricts the Y component dvp_Y to a value in a predetermined range. The displacement prediction vector limiter 30321a then reconstructs the prediction vector dvpLX from the X component dvp_X and the Y component dvp_Y whose value has been restricted to the predetermined range, and outputs the reconstructed prediction vector dvpLX to the adding unit 3035.
In modification D2, the reference layer determination unit 30311 (see fig. 4) may be omitted from the external prediction parameter extraction unit 3031.
In addition, in the encoded stream Te, the displacement vector restriction information disparity _ restriction may be included in each layer image (parallax image) or each sequence. Coding parameters for a sequence of all layers (views) included in a certain stream are referred to as View Parameter Sets (VPS). The Set of coding parameters for each Sequence is referred to as a Sequence Parameter Set (SPS). The sequence is a group of a plurality of images in a period from one reset (reset) to the next reset in the decoding process.
The image decoding device 31b described above includes the reference layer determination unit 30311, which determines whether the vector or the vector difference of the target block relates to outer layer prediction, that is, prediction between different layers, and decodes the displacement vector restriction information disparity_restriction from a parameter set higher than the target block. When the reference layer determination unit 30311 determines that outer layer prediction is performed and the displacement vector restriction information disparity_restriction has a predetermined value, at least one component of the vector of the target block is derived as a predetermined value (for example, 0) without being decoded from the encoded data. Likewise, when the reference layer determination unit 30311 determines that outer layer prediction is performed for the target block, at least one component of the difference vector of the target block is derived as a predetermined value (for example, 0) without being decoded from the encoded data.
Thus, in modification D2, whether or not the displacement in the Y direction is used can be switched according to the arrangement of the viewpoints (cameras) from which the images are acquired, or according to the scene of the captured images. For example, in an image in which mainly vertical edges appear (for example, an image of a lattice extending in the vertical direction), a displacement (parallax) in the Y direction may not be perceived. Therefore, whether or not to use the Y-direction displacement can be switched depending on whether the images captured by a plurality of cameras arranged at positions deviating from a straight line extending in the horizontal direction are scenes in which mainly vertical edges appear. This makes it possible to achieve, for the scene as a whole, both suppression of image quality degradation and reduction of the information amount of the bit stream Te.
(modification D3)
Next, another modification D3 of the present embodiment will be described.
The image decoding apparatus 31c according to modification D3 includes an extrinsic prediction parameter decoding unit 303c instead of the extrinsic prediction parameter decoding unit 303 (fig. 3) of the image decoding apparatus 31 (fig. 1 and 2). The configuration of the image decoding device 31c according to modification D3 is the same as the configuration of the image decoding device 31 except for the external prediction parameter decoding unit 303 c. Therefore, the same reference numerals are given to the same structures, and the above description is applied.
When the reference layer information reference _ layer _ info is 1, that is, the vector mvLX is a displacement vector (subjected to outer layer prediction), the vector differential decoding unit 30312 decodes a component value (X component) dvd _ X for the horizontal direction (X direction) from the coded stream Te, and determines a component value (Y component) dvd _ Y for the vertical direction (Y direction) to be a predetermined value, for example, zero. The vector differential decoding unit 30312 combines the X component dvd _ X and the Y component dvd _ Y to form a differential vector dvdLX, and decodes the differential vector dvdLX.
On the other hand, when the vector mvLX is a motion vector, the vector differential decoding unit 30312 decodes the X component mvd _ X and the Y component mvd _ Y from the coded stream Te and outputs them as a differential vector mvdLX. The vector difference decoding unit 30312 outputs the decoded difference vector mvdLX (or dvdLX) to the addition unit 3035.
When the difference vector mvdLX is a difference vector of a displacement vector, the extrinsic prediction parameter decoding unit 303 in modification D3 decodes the X component dvd _ X from the coded stream Te and derives the Y component dvd _ Y by a predetermined rule.
Even when the difference vector mvdLX is a difference vector relating to a displacement vector, the difference vector mvdLX may be a two-dimensional vector in which the value of the Y component mvd _ Y is always zero, for example.
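A minimal sketch of the differential decoding of modification D3; `read_component` stands in for the actual entropy decoding of one vector component, and the fixed Y value of zero is the predetermined value given in the text:

```python
def decode_difference_vector(read_component, is_displacement):
    """Decode a difference vector: for a displacement vector, only the
    X component is read from the coded stream and the Y component is set
    to the predetermined value zero; for a motion vector, both components
    are read from the stream."""
    x = read_component()
    y = 0 if is_displacement else read_component()
    return (x, y)
```

For example, with `components = iter([7, 3])`, a displacement vector consumes only the first component and yields (7, 0), while a motion vector consumes both and yields (7, 3).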
The coded stream Te input to the entropy decoding unit 301 (see fig. 2) includes coefficients indicating the relationship between the X component dvd_X and the Y component dvd_Y of the displacement vector. These coefficients are a slope coefficient inter_view_grad and an intercept coefficient inter_view_offset. The slope coefficient inter_view_grad indicates the amount of change of the Y component dvd_Y with respect to a change of the X component dvd_X. The intercept coefficient inter_view_offset indicates the value of the Y component dvd_Y when the X component dvd_X is zero.
Fig. 6 is a schematic diagram showing the configuration of the external prediction parameter decoding unit 303c according to this modification. The extrinsic prediction parameter decoding unit 303c further includes a displacement vector generation unit 3038 in the extrinsic prediction parameter decoding unit 303 (fig. 3) of the image decoding device 31 (fig. 2). The configuration of the extrinsic prediction parameter decoding unit 303c in modification D3 is the same as that of the extrinsic prediction parameter decoding unit 303 (fig. 3) except for the displacement vector generating unit 3038. Therefore, the same reference numerals are given to the same structures, and the above description is applied.
Here, the external prediction parameter extraction unit 3031 separates the slope coefficient inter_view_grad and the intercept coefficient inter_view_offset from the coded stream Te, and outputs the separated slope coefficient inter_view_grad and intercept coefficient inter_view_offset to the displacement vector generation unit 3038. This also applies when the coded stream Te does not include the slope coefficient inter_view_grad but includes the intercept coefficient inter_view_offset. In that case, the external prediction parameter extraction unit 3031 separates the intercept coefficient inter_view_offset from the coded stream Te and outputs it to the displacement vector generation unit 3038.
The displacement vector dvLX output from the adder 3035 to the displacement vector generator 3038 is a scalar value that includes the X component dv _ X as an element and does not include the Y component dv _ Y. The displacement vector dvLX may be a vector of fixed values (or indefinite values) in which the value of the Y component dv _ Y is zero.
Next, the configuration of the displacement vector generator 3038 will be described.
Fig. 7 is a schematic diagram showing the configuration of the displacement vector generator 3038 according to this modification.
The displacement vector generator 3038 includes a reference layer determination unit 30381 and a displacement vector setting unit 30382.
As in the case of the reference layer determination unit 30311 (see fig. 4), the reference layer determination unit 30381 determines, based on the reference picture index refIdxLX input from the external prediction parameter extraction unit 3031, whether the vector mvLX is a displacement vector (outer layer prediction) or a motion vector. The reference layer determination unit 30381 generates reference layer information reference_layer_info for the vector mvLX, and outputs the generated reference layer information reference_layer_info to the displacement vector setting unit 30382.
The vector mvLX is input from the adding unit 3035 to the displacement vector setting unit 30382, and the slope coefficient inter_view_grad and the intercept coefficient inter_view_offset are input from the external prediction parameter extraction unit 3031.
When the reference layer information reference_layer_info indicates that the vector mvLX is a displacement vector (outer layer prediction), the displacement vector setting unit 30382 calculates its Y component dv_Y based on its X component dv_X, the slope coefficient inter_view_grad, and the intercept coefficient inter_view_offset. Here, the displacement vector setting unit 30382 uses the equation dv_y = inter_view_grad * dv_x + inter_view_offset. When the coded stream Te does not include the slope coefficient inter_view_grad but includes the intercept coefficient inter_view_offset, the displacement vector setting unit 30382 uses the equation dv_y = inter_view_offset. The displacement vector setting unit 30382 may also derive an intermediate value of the Y component dv_Y by multiplying the absolute value of the X component dv_X by the slope coefficient inter_view_grad and adding the intercept coefficient inter_view_offset, and then set the sign of the intermediate value of the Y component dv_Y so as to match the sign of the X component dv_X.
The displacement vector setting unit 30382 combines the input X component dv _ X and the calculated Y component dv _ Y to form a two-dimensional displacement vector dvLX, stores the formed displacement vector dvLX in the prediction parameter memory 307, and outputs the result to the predicted image generating unit 308. In addition, inter _ view _ grad and inter _ view _ offset are integers.
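The linear relation used by the displacement vector setting unit 30382 can be written directly (the function name is illustrative):

```python
def derive_dv_y(dv_x, inter_view_grad, inter_view_offset):
    """Y component of a displacement vector derived from its X component:
    dv_y = inter_view_grad * dv_x + inter_view_offset.
    When inter_view_grad is absent from the stream, the relation reduces
    to dv_y = inter_view_offset (a slope of zero)."""
    return inter_view_grad * dv_x + inter_view_offset
```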
When the Y component dv _ Y is quantized in the predetermined quantization step GRAD, the displacement vector setting unit 30382 may calculate the Y component dv _ Y using the following expression.
dv_y = (inter_view_grad × dv_x) / GRAD + inter_view_offset
In the case where the value of the quantization step GRAD is 2 to the power GSHIFT (GRAD = 1 << GSHIFT), the displacement vector setting unit 30382 may use the following equation.
dv_y = ((inter_view_grad × dv_x + ROUND) >> GSHIFT) + inter_view_offset
Here, GSHIFT is a predetermined shift value (an integer greater than 1). 1 << GSHIFT denotes 2 raised to the power GSHIFT. floor{…} denotes the floor function, which gives the integer part of the real number …. >> is a right bit-shift operator: … >> GSHIFT corresponds to dividing … by 2 to the power GSHIFT. ROUND is a constant used to implement rounded division; its value is 1 << (GSHIFT − 1), half of the divisor 2 to the power GSHIFT.
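The Y-component derivation described above, including the shift-based quantized form, can be sketched as follows. This is an illustrative Python sketch: the function name is not from the document, and it assumes GSHIFT ≥ 1 so that ROUND = 1 << (GSHIFT − 1) is well defined.

```python
def derive_dv_y(dv_x, inter_view_grad, inter_view_offset, gshift):
    """Derive the Y component of a displacement vector from its X component,
    a slope quantized with step GRAD = 1 << gshift, and an intercept,
    using rounded division as in the expressions above."""
    round_const = 1 << (gshift - 1)  # half of the divisor 2**gshift
    return ((inter_view_grad * dv_x + round_const) >> gshift) + inter_view_offset

# When only inter_view_offset is signalled, dv_y is simply inter_view_offset.
```

For example, with gshift = 2 (GRAD = 4), a slope of 4 and an intercept of 1 give dv_y = 9 for dv_x = 8.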
When the reference layer information reference _ layer _ info indicates that the vector mvLX is a motion vector, the displacement vector setting unit 30382 stores the motion vector mvLX inputted from the adder 3035 in the prediction parameter memory 307 as it is, and outputs the motion vector mvLX to the predicted image generating unit 308.
In the coded stream Te, the slope coefficient inter_view_grad and the intercept coefficient inter_view_offset may be included per sequence or per picture. A set of coding parameters signalled per sequence is referred to as a Sequence Parameter Set (SPS), and a set of coding parameters signalled per picture is referred to as a Picture Parameter Set (PPS). These coefficients may also be included in a picture header, which is a set of parameters for each picture. In particular, by signalling these coefficients per picture, it becomes possible to track fine scene changes on a picture-by-picture basis.
The image decoding device 31c described above includes the reference layer determination unit 30381, which determines whether the vector or vector difference of the target block belongs to outer layer prediction, that is, prediction between different layers. When the reference layer determination unit 30381 determines that outer layer prediction is performed, at least one component of the vector of the target block is derived from the other component of the target block and values decoded from parameters at a level higher than the target block (for example, per sequence or per picture). Furthermore, when the reference layer determination unit 30381 determines that outer layer prediction is performed, at least one component of the difference vector of the target block is derived as a predetermined value (for example, 0) without being decoded from the encoded data.
Thus, in modification D3, even when the arrangement or orientation of the viewpoints (cameras) from which the images are obtained is not exactly parallel in one direction, so that the displacement in the Y direction does not always equal a predetermined value (for example, zero), the displacement in the Y direction can be reproduced. In the present modification, the relationship between the displacement in the X direction and the displacement in the Y direction can be switched according to the scene of the captured image. Over an entire scene, both suppression of image-quality degradation and reduction of the information amount of the coded stream Te can therefore be achieved.
(modification D4)
Next, another modification D4 of the present embodiment will be described. The same reference numerals are given to the same structures, and the above description is applied.
The image decoding apparatus 31d of the present modification includes an external prediction parameter decoding unit 303d instead of the external prediction parameter decoding unit 303 (fig. 3) of the image decoding apparatus 31. Except for this, the configuration of the external prediction parameter decoding unit 303d according to modification D4 is the same as that of the external prediction parameter decoding unit 303 (fig. 3). Therefore, the same reference numerals are given to the same structures, and the above description is applied.
Fig. 8 is a schematic diagram showing the configuration of the external prediction parameter decoding unit 303d according to this modification.
The outer prediction parameter decoding unit 303d includes an outer prediction parameter extraction unit 3031, an AMVP prediction parameter derivation unit 3032, an addition unit 3035, a merged prediction parameter derivation unit 3036, and a displacement vector limiter unit (displacement vector limiter unit) 3037.
The displacement vector limiter unit 3037 stores the reference image index refIdxLX input from the external prediction parameter extraction unit 3031 in the prediction parameter memory 307, and outputs it to the predicted image generation unit 308. In addition, when the vector mvLX input from the adder 3035 is a displacement vector (outer layer prediction), that is, when the reference layer determination unit 30371 determines from the input reference picture index refIdx that outer layer prediction is performed, the displacement vector limiter 3037 limits the value of the vector to a predetermined range and outputs the value-limited displacement vector dvLX to the predicted image generation unit 308.
The displacement vector limiter unit 3037 may limit the range of the displacement vector dvLX inputted from the merged prediction parameter derivation unit 3036 to a value within a predetermined range, in the same manner as the displacement vector dvLX inputted from the adder unit 3035.
Fig. 9 is a schematic diagram showing the configuration of the displacement vector limiter 3037 according to modification D4.
The displacement vector limiter 3037 includes a reference layer determination unit 30371 and a vector limiter 30372.
As with the reference layer determination unit 30311 (see fig. 4), the reference layer determination unit 30371 determines whether the vector mvLX is a displacement vector (disparity vector, i.e., outer layer prediction) or a motion vector, based on the reference picture index refIdxLX input from the entropy decoding unit 301.
The vector limiter 30372 receives the vector mvLX from the adder 3035 or the merged prediction parameter derivation unit 3036, and receives the reference layer information reference_layer_info from the reference layer determination unit 30371.
When the reference layer information reference _ layer _ info indicates that the vector mvLX is a displacement vector (subjected to the outer layer prediction), the vector limiter 30372 limits the value of the displacement vector dvLX to a value within a predetermined range. The restriction of the displacement vector may also be performed for one of the X component, the Y component, or for both the X component and the Y component.
For example, when the value of the X component dv _ X of the displacement vector dvLX is larger than a predetermined upper limit value of the X component, the vector limiter 30372 determines the value of the X component dv _ X as the upper limit value of the X component. When the value of the X component dv _ X is smaller than a predetermined lower limit value of the X component, the vector limiter 30372 determines the value of the X component dv _ X as the lower limit value of the X component.
When the value of the Y component dv _ Y is larger than the predetermined upper limit value of the Y component, the vector limiter 30372 determines the value of the Y component dv _ Y as the upper limit value of the Y component. When the value of the Y component dv _ Y is smaller than the predetermined lower limit value of the Y component, the vector limiter 30372 determines the value of the Y component dv _ Y as the lower limit value of the Y component.
In consideration of the case where a plurality of layer images are processed in parallel as described later, effective upper limit values of the Y component are, for example, 28 or 56. The range of the displacement vector is, for example, about 10% of the picture width. The upper limit value of the X component is, for example, 1/10 or 1/8 of the picture width; it may also be 1/16 of the picture width.
In the above example, the value of the X component dv_x or the Y component dv_y is clipped to the upper or lower limit value when it exceeds that value, but the present embodiment is not limited to this. For example, the vector limiter 30372 may limit each of the X component dv_x and the Y component dv_y based on the remainder with respect to the domain width (that is, the difference between the upper limit value and the lower limit value). For example, when the domain width of the X component is 2^k (k is an integer greater than 0, for example 7), dv_x may be limited to dv_x % 2^k − 2^k (when dv_x % 2^k ≥ 2^(k−1)) and to dv_x % 2^k (when 0 ≤ dv_x % 2^k < 2^(k−1)). Here, … % … denotes the remainder obtained by dividing the first variable by the second. In this way, the value of the X component dv_x is limited to the range from a minimum of −2^(k−1) to a maximum of 2^(k−1) − 1.
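The remainder-based limiting described above maps any component value into the range −2^(k−1) … 2^(k−1) − 1. A minimal sketch follows; the function name is illustrative, and the conditions are assumed to apply to dv_x % 2^k as read from the text:

```python
def wrap_component(v, k):
    """Limit component v to [-2**(k-1), 2**(k-1) - 1] using the remainder
    with respect to the domain width 2**k."""
    width = 1 << k
    r = v % width                      # remainder in [0, 2**k)
    return r - width if r >= (1 << (k - 1)) else r
```

Python's % operator returns a non-negative remainder for a positive modulus, so the sketch behaves consistently for negative inputs as well.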
Further, since the value of the difference vector corresponding to the displacement vector dvLX is also limited, the code amount required when encoding the difference vector can be reduced.
In addition, in each of the above examples, the domain width of the Y component may be made smaller than the domain width of the X component. For example, when the domain width of the X component is 2048, the domain width of the Y component may be 64. That is, by restricting the range of values of the Y component to a range narrower than that of the X component, the amount of information relating to the Y component can be reduced.
The vector limiter 30372 outputs the value-limited displacement vector dvLX to the predicted image generator 308.
When the reference layer information reference _ layer _ info indicates that the vector mvLX is not an external layer prediction (motion vector), the vector limiter 30372 outputs the input vector mvLX to the predicted image generator 308 as it is.
In the present modification, an internal memory 3061 (cache memory) may be provided, and the external predicted image generation unit 309 may read the reference image block via the internal memory 3061. The internal memory 3061 may be a storage medium having a smaller storage capacity and a higher access speed than the reference image memory 306.
Fig. 10 is a conceptual diagram illustrating an example of a configuration in which the external predicted image generation unit 309 reads a reference image block in modification D4.
The rectangular area of the reference image memory 306 shown in fig. 10 includes a reference area 1. The reference area 1 is the reference area for displacement vectors before the value limitation of the present modification D4 is applied. The reference area 1 includes a reference area 2. The reference area 2 is the reference area for displacement vectors whose values are limited in modification D4.
The internal memory 3061 includes a reference area 3. The reference area 3 is an area where the reference image stored in the reference area 2 of the reference image memory 306 is temporarily stored. The external predicted image generating unit 309 includes a displacement predicted image generating unit 3091.
The displacement predicted image generation unit 3091 reads out, from the internal memory 3061, the reference image block located at the position indicated by the displacement vector mvLX relative to the decoding target block. Here, the area from which the reference image block is read becomes the reference area 3. Since the reference area 3 only needs to cover the range of values that the displacement vector mvLX can take, the storage capacity of the internal memory 3061 can be reduced in the present modification D4, and the image decoding apparatus can be realized at low cost. The displacement predicted image generation unit 3091 performs prediction processing on the read reference image block to generate a displacement-compensated predicted image as the predicted image block P.
Fig. 23 is a diagram illustrating the necessity of limiting a displacement vector.
As shown in fig. 23(a), when decoding a moving picture composed of a plurality of layer images on a device with limited decoding speed, the plurality of layer images (here, layer 0 and layer 1) may be decoded in parallel so that decoding completes within a predetermined time (16 ms in this example).
In scalable encoding, since layer 1 is encoded depending on layer 0, the two layers cannot be decoded completely in parallel; however, when decoding in units of blocks, for example, layer 1 can be decoded a predetermined number of block lines behind layer 0.
Fig. 23(b) shows an example in which layer 1 is decoded one coded tree block line (1 CTB [Coding Tree Block] line) later than layer 0 (shown at the top of fig. 23(b)). In fig. 23(b), an arrow indicates the decoded region (decoding front) of each layer image; the tip of the arrow indicates the right end of the block currently being decoded. The shaded portion represents the (available) area of layer image 0 that, among the areas already decoded, can currently be used as a reference image when decoding layer image 1. The vertical-stripe portion indicates the (unavailable) area of layer image 0 that, although already decoded, cannot yet be used as a reference image when decoding layer image 1. The unavailability is due to processing delays in the decoding process, namely the deblocking filter (DF) and sample adaptive offset (SAO) filtering.
Here, as the range of the displacement vector indicating the region of the image of the layer 0 that can be referred to from the layer 1, for example, the restriction shown in the following formula may be applied.
Upper limit of the Y coordinate of the displacement vector = height of the coding block × N − LFH − MCH
Upper limit of the X coordinate of the displacement vector = width of the coding block × M − LFW − MCW
Here, N is the number of coding block lines (rows) of delay, M is the number of coding block columns of delay, LFW and LFH are the width and height of the range affected by the loop filter, and MCW and MCH are the width and height of the range affected by the motion compensation filter. In general, LFW and LFH are 4, the sum of 3 (the range of the deblocking filter) and 1 (the range of the adaptive offset filter), but the present invention is not limited to this. For example, when the adaptive offset filter is not applied to the reference layer, only the deblocking filter range of 3 may be used (LFH = LFW = 3), and when no loop filter is applied to the reference layer, 0 may be used (LFH = LFW = 0). MCW and MCH are 4 when the number of taps of the motion compensation filter is 8, and 3 when the number of taps is 6.
For example, when the size of the coding block is 32, the range of the displacement vector is as follows when the delay of 1 line is set.
Upper limit of the Y coordinate of the displacement vector = 32 × 1 − 4 = 28
Here, the upper limit of the Y coordinate of the displacement vector may be set to 28. For example, when the size of the coding block is 64 and a delay of 1 line is set, the following value may be determined as the range of the displacement vector.
Upper limit of the Y coordinate of the displacement vector = 64 × 1 − 4 − 4 = 56
Here, the upper limit of the Y coordinate of the displacement vector may be set to 56.
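Assuming the limit formula above (coding block size × delay − loop-filter range − motion-compensation range), the upper limits can be computed as follows. This is an illustrative sketch; the function name and the parameter choices in the examples (for instance MCH = 0 in the 28 case) are assumptions rather than normative values:

```python
def dv_upper_limit(block_size, delay, lf_range, mc_range):
    """Upper limit of a displacement-vector coordinate when the dependent
    layer is decoded `delay` coding-block lines (or columns) behind the
    reference layer."""
    return block_size * delay - lf_range - mc_range

# Examples matching the values in the text:
#   64x64 blocks, 1-line delay, LFH = 4, MCH = 4 -> 56
#   32x32 blocks, 1-line delay, LFH = 4, MCH = 0 -> 28
```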
The image decoding device 31d described above includes a prediction parameter decoding unit (external prediction parameter decoding unit 303d) that derives the vector of a target block as the sum of a vector difference decoded from the encoded data and a prediction vector, a predicted image generation unit (external predicted image generation unit 309) that reads the reference image of the region indicated by the derived vector from the reference image storage unit and generates a predicted image based on the read reference image, a reference layer determination unit 30371 that determines whether the vector of the target block belongs to outer layer prediction, that is, prediction between different layers, and a vector limiter (vector limiter 30372) that restricts at least one component of the vector of the target block to a predetermined range when the reference layer determination unit 30371 determines that outer layer prediction is performed.
In this way, modification D4 can reduce the range of values that the displacement vector can take. This range corresponds to the reference area, that is, the range referred to when the external predicted image generation unit 309 accesses, for a given decoding target block, a reference image of a layer other than the decoding target (target layer image) in the reference image memory 306. In other words, in the present modification, by reducing the reference area in the reference image memory 306 for a layer image different from the layer to be decoded, even a storage medium of small capacity can serve as the reference image memory 306, and the processing can be sped up.
Further, as described with reference to fig. 23, when decoding of a layer image different from the decoding target image (for example, the base view) and decoding of the decoding target image are performed in parallel, the reference range in the different layer image referred to from the decoding target image is limited, so decoding of the decoding target image can proceed before the different layer image has been completely decoded.
(modification D5)
Next, another modification D5 of the present embodiment will be described. The same reference numerals are given to the same structures, and the above description is applied.
The image decoding apparatus 31e according to modification D5 includes an extended vector candidate derivation unit 30335e instead of the extended vector candidate derivation unit 30335 (see fig. 5) of the image decoding apparatus 31 c.
The image decoding apparatus 31e may or may not include the displacement vector limiter unit 3037 (fig. 8) in the outer prediction parameter decoding unit 303.
Fig. 11 is a schematic diagram showing the configuration of the extended vector candidate derivation unit 30335e according to modification D5.
The extended vector candidate derivation unit 30335e includes a displacement vector limiter unit (displacement vector limiter unit) 30338 in addition to the displacement vector acquisition unit 30336 and the outer layer vector candidate derivation unit 30337.
As described above, the outer layer vector candidate derivation unit 30337 outputs the displacement vector derived by the displacement vector acquisition unit 30336 when the reference image is a layer image different from the target image. When the reference image is a layer image of the same layer as the target image, the outer layer vector candidate derivation unit 30337 specifies a layer image different from the target image (here, the layer image of the reference viewpoint), obtains a displacement vector for the specified image from the displacement vector acquisition unit 30336, and reads from the prediction parameter memory 307 the vector at the position in the specified layer image (the layer image of the reference viewpoint) indicated by that displacement vector.
The displacement vector limiter unit 30338 according to modification D5 limits the reference region used when the outer layer vector candidate derivation unit 30337 reads the prediction parameters of a layer image other than the target image from the prediction parameter memory 307. Specifically, it limits the value of the displacement vector dvLX input from the displacement vector acquisition unit 30336 to a value within a predetermined range. The process of limiting the value of the displacement vector dvLX by the displacement vector limiter 30338 may be the same as that of the displacement vector limiter 30372.
The displacement vector limiter 30338 limits the reference region by the following expression, for example.
dvLX[0]′=Clip3(-DataRangeX,DataRangeX-1,dvLX[0])
dvLX[1]′=Clip3(-DataRangeY,DataRangeY-1,dvLX[1])
Here, dvLX[0] and dvLX[1] are the displacement vector components input from the displacement vector acquisition unit 30336, and dvLX[0]′ and dvLX[1]′ are the output components. DataRangeX and DataRangeY are predetermined constants indicating the limit range. Clip3(x, y, z) is a function that limits z to x or more and y or less.
The displacement vector limiter 30338 may limit the reference region using the following expression.
dvLX[0]′=Clip3(-DataRangeX,DataRangeX,dvLX[0])
dvLX[1]′=Clip3(-DataRangeY,DataRangeY,dvLX[1])
The displacement vector limiter unit 30338 outputs each of the displacement vectors dvLX whose values are limited to the outer layer vector candidate derivation unit 30337.
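The Clip3-based component limiting shown above can be sketched as follows; the function names are illustrative, and the asymmetric variant Clip3(−R, R − 1, v) is the one modelled here:

```python
def clip3(lo, hi, v):
    """Clip3(x, y, z): limit z to x or more and y or less."""
    return max(lo, min(hi, v))

def limit_displacement_vector(dv, data_range_x, data_range_y):
    """Limit both components of displacement vector dv = (x, y) to the
    predetermined ranges, as in the expressions above."""
    return (clip3(-data_range_x, data_range_x - 1, dv[0]),
            clip3(-data_range_y, data_range_y - 1, dv[1]))
```

With DataRangeX = 1024 and DataRangeY = 32, for example, a vector (2000, 40) is limited to (1023, 31).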
In this case, for example, with the decoding target block as the starting point, the outer layer vector candidate derivation unit 30337 specifies, in the specified image, the position of the block shifted from that starting point by the displacement vector dvLX limited by the displacement vector limiter unit 30338, using the following expressions.
xRef=xP+((nPSW-1)>>1)+((dvLX[0]′+2)>>2)
yRef=yP+((nPSH-1)>>1)+((dvLX[1]′+2)>>2)
Here, xRef and yRef are the coordinates of the corresponding block, xP and yP are the upper-left coordinates of the decoding target block, nPSW and nPSH are the width and height of the decoding target block, and dvLX[0]′ and dvLX[1]′ are the X and Y components of the displacement vector limited by the displacement vector limiter unit 30338. Further, xP + ((nPSW − 1) >> 1) and yP + ((nPSH − 1) >> 1) indicate that the center of the decoding target block is used as the starting point; other positions such as the upper left or lower right of the decoding target block may also be used.
When the Y component of the displacement vector is limited to 0, the outer layer vector candidate derivation unit 30337 may specify the position of the corresponding block with the displacement vector associated with the Y component as 0 using the following expression.
xRef=xP+((nPSW-1)>>1)+((dvLX[0]′+2)>>2)
yRef=yP+((nPSH-1)>>1)
In this case, the X coordinate of the corresponding block is the position shifted by the displacement vector from the X coordinate of the starting point in the decoding target block, while the Y coordinate of the corresponding block coincides with the Y coordinate of that starting point.
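The corresponding-block position derivation above can be sketched as follows. The (… + 2) >> 2 term suggests quarter-pel vector units rounded to integer pixels; the function name is illustrative, not from the document:

```python
def corresponding_block(xP, yP, nPSW, nPSH, dv):
    """Locate the corresponding block in the reference layer image:
    start from the centre of the decoding target block and shift by the
    limited displacement vector dv, rounded from quarter-pel to full-pel."""
    xRef = xP + ((nPSW - 1) >> 1) + ((dv[0] + 2) >> 2)
    yRef = yP + ((nPSH - 1) >> 1) + ((dv[1] + 2) >> 2)
    return xRef, yRef
```

When the Y component is restricted to 0, yRef reduces to yP + ((nPSH − 1) >> 1), matching the second pair of expressions.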
In this way, the outer layer vector candidate derivation unit 30337 limits the reference region from which prediction parameters, such as the vector mvLX of already-processed blocks in a layer image other than the target layer image (for example, the target layer image is a non-reference viewpoint image and the reference layer image is a reference viewpoint image), are read from the prediction parameter memory 307. This is because the reference area corresponds to the range of values that the displacement vector dvLX can take with the decoding target block as the starting point.
That is, in modification D5, by reducing the reference area in the prediction parameter memory 307, even a storage medium of small capacity can serve as the prediction parameter memory 307, and the processing can be sped up. Here, the present modification D5 may include an internal memory 3076, and the outer layer vector candidate derivation unit 30337 may read the prediction parameters via the internal memory 3076. The internal memory 3076 may be a storage medium having a smaller storage capacity and a higher access speed than the prediction parameter memory 307.
Fig. 12 is a conceptual diagram illustrating an example of a configuration for reading prediction parameters in modification D5.
The rectangular area of the prediction parameter memory 307 shown in fig. 12 includes a reference area 4 and a reference area 5. The reference area 5 is the reference area for the prediction parameters of the decoding target image (target layer image), that is, the area storing the prediction parameters that the displacement vector acquisition unit 30336 reads. Since the prediction parameters read by the displacement vector acquisition unit 30336 are limited to those of blocks within a predetermined range from the decoding target block in the same image, the reference area 5 is limited to a range corresponding to that number of blocks. The reference area 4 is the reference area for the prediction parameters of a layer image different from the decoding target (for example, an image of the base view), that is, the area storing the prediction parameters that the outer layer vector candidate derivation unit 30337 reads. The reference area 4 corresponds to the range of values that the displacement vector dvLX can take with the decoding target block as the starting point. In the present modification D5, since the displacement vector limiter 30338 limits the range of values that the displacement vector can take, the reference area 4 is limited.
The image decoding device 31e described above includes a displacement vector derivation unit (displacement vector acquisition unit 30336) that derives the displacement vector of a target block, a prediction parameter decoding unit (extended vector candidate derivation unit 30335e) that derives, as the prediction vector of the target block, an already-decoded vector stored in the prediction parameter storage unit at the position shifted from the target block by the displacement vector, a predicted image generation unit (external predicted image generation unit 309) that reads the reference image of the region indicated by the derived vector from the reference image storage unit and generates a predicted image based on the read reference image, a reference layer determination unit 30371 that determines whether the vector of the target block belongs to outer layer prediction, that is, prediction between different layers, and a vector limiter (displacement vector limiter unit 30338) that restricts at least one component of the displacement vector to a predetermined range when the reference layer determination unit 30371 determines that outer layer prediction is performed.
The prediction parameters, such as the motion vector mvLX, of the layer image different from the decoding target image that are stored in the reference area 4 of the prediction parameter memory 307 are temporarily stored in the internal memory 3076, and the outer layer vector candidate derivation unit 30337 reads the prediction parameters from the internal memory 3076. Since the reference area 4 only needs to cover the range of values that the displacement vector dvLX can take, the storage capacity of the internal memory 3076 can be reduced in the present modification D5, and the image decoding apparatus can be realized at low cost.
The image decoding device 31e according to modification D5 further includes a merge prediction parameter derivation unit 3036e described below, instead of the merge prediction parameter derivation unit 3036 (see fig. 6) of the image decoding device 31 c.
Fig. 13 is a schematic diagram showing the configuration of the merging prediction parameter derivation unit 3036e according to modification D5.
The merge prediction parameter derivation unit 3036e includes a merge candidate derivation unit and a merge candidate selection unit, which are not shown. The merge candidate derivation unit derives a merge candidate list composed of a plurality of merge candidates. The merge candidate selection unit selects the merge candidate indicated by the merge index merge_idx from the merge candidate list. The merge candidate derivation unit includes a basic merge candidate derivation unit and an extended merge candidate derivation unit 30360, which are not shown. The basic merge candidate derivation unit derives, as merge candidates, the prediction parameters of reference blocks located within a predetermined range from the block to be decoded. A reference block is, for example, a block adjacent to at least one of the lower left end, the upper left end, and the upper right end of the decoding target block. An extended merge candidate is a merge candidate derived using the prediction parameters of a reference layer image of a layer different from the decoding target image (target layer image). The extended merge candidate derivation unit 30360 included in the merge prediction parameter derivation unit 3036e includes a displacement vector acquisition unit 30361, a displacement vector limiter unit 30338, and an outer layer merging candidate derivation unit 30363.
The displacement vector acquisition unit 30361 reads the displacement vector dvLX from the prediction parameter memory 307, from a block adjacent to the target block. The displacement vector acquisition unit 30361 outputs the read displacement vector dvLX to the displacement vector limiter unit 30338.
The displacement vector limiter unit 30338 limits the range of the value of each of the displacement vectors dvLX input from the displacement vector acquisition unit 30361 to a value within a predetermined range. The above description is applied to the process of limiting the value of the displacement vector dvLX by the displacement vector limiter 30338.
The outer layer merging candidate derivation unit 30363 determines the reference picture index refIdx corresponding to a layer image (for example, the base view) different from the layer image of the decoding target image.
The outer layer merging candidate derivation unit 30363 specifies the image of the layer image (for example, the base view) corresponding to the determined reference image index refIdx. With the decoding target block as the starting point, the outer layer merging candidate derivation unit 30363 specifies, in the specified image, the corresponding blocks located at positions shifted from that starting point by the displacement vectors dvLX input from the displacement vector limiter unit 30338. The outer layer merging candidate derivation unit 30363 reads the prediction parameters of the corresponding block, stores the read prediction parameters in the prediction parameter memory 307, and outputs them to the predicted image generation unit 308.
In this way, the outer layer merging candidate derivation unit 30363 limits the reference region, that is, the range of blocks whose prediction parameters can be read, to the range of values that the displacement vector dvLX can take with the decoding target block as the starting point.
That is, in modification D5, by reducing the reference area in the prediction parameter memory 307 for a layer image different from the decoding target image, even a storage medium of small capacity can be used as the prediction parameter memory 307, and the processing can be sped up. Note that, in modification D5, the outer layer merging candidate derivation unit 30363 may include an internal memory 3072 (not shown) and read the prediction parameters via the internal memory 3072. The internal memory 3072 may be a storage medium having a smaller storage capacity and a higher access speed than the prediction parameter memory 307. This makes it possible to realize the image decoding device at low cost.
In addition, in the extended vector candidate derivation unit 30335e and the outer-layer merge candidate derivation unit 30363 of modification D5, as described with reference to fig. 23, when decoding of a layer picture (for example, base view) different from the decoding target picture and decoding of the decoding target picture operate in parallel, the reference region of prediction parameters in the prediction parameter memory 307 of the different layer picture referred to from the decoding target picture is restricted. Decoding of the decoding target picture can therefore start before the different layer picture (for example, base view) has been completely decoded (that is, even when not all prediction parameters of the different layer picture in the prediction parameter memory 307 can yet be referred to).
(modification D6)
Next, another modification D6 of the present embodiment will be described.
The image decoding device 31f according to modification D6 includes an entropy decoding unit 301f instead of the entropy decoding unit 301 (fig. 3) of the image decoding device 31 (figs. 1 and 2). The image decoding device 31f according to modification D6 has the same configuration as the image decoding device 31 except for the entropy decoding unit 301f. Therefore, the same reference numerals are given to the same structures, and the above description applies.
Before the description of the present modification D6, the configuration of the entropy decoding unit 301 will be described.
Fig. 24 is a block diagram showing the structure of the entropy decoding unit 301.
The entropy decoding unit 301 includes an arithmetic symbol decoding unit 3011 and a vector difference syntax decoding unit 3013.
The arithmetic code decoding unit 3011 decodes each bit included in the coded stream Te with reference to the context variable CV. The arithmetic code decoding unit 3011 includes a context record update unit 30111 and a bit decoding unit 30112.
The context record update unit 30111 records and updates the context variables CV managed in association with the context indexes ctxIdxInc. Here, the context variable CV includes (1) a dominant symbol MPS (most probable symbol) having a high probability of occurrence and (2) a probability state index pStateIdx specifying the probability of occurrence of the dominant symbol MPS.
The context record update unit 30111 updates the context variable CV with reference to the context index ctxIdxInc determined based on the transform coefficients obtained in the inverse quantization/inverse DCT unit 311 and the value of the Bin decoded by the bit decoding unit 30112, and records the updated context variable CV until the next update. A Bin is each bit of the bit string constituting the information. The value of the dominant symbol MPS is 0 or 1. Further, the dominant symbol MPS and the probability state index pStateIdx are updated every time the bit decoding unit 30112 decodes 1 Bin.
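The context variable update described above can be illustrated with the following non-normative Python sketch. The class name, the simplified ±1 probability-state walk, and the MAX_STATE bound are illustrative assumptions; the actual CABAC state transition uses fixed tables defined by the coding specification.

```python
# Illustrative sketch of a CABAC-style context variable (CV) update, as
# performed by the context record update unit 30111. The real transition
# tables are defined by the CABAC specification; a simplified +/-1
# probability-state walk is used here purely to show the mechanism.

class ContextVariable:
    """Holds the dominant symbol MPS and a probability state index pStateIdx."""
    MAX_STATE = 62  # assumed upper bound for this sketch

    def __init__(self, mps=0, p_state_idx=0):
        self.mps = mps                   # dominant symbol MPS: 0 or 1
        self.p_state_idx = p_state_idx   # higher = MPS more probable

    def update(self, bin_val):
        # Performed every time one Bin is decoded.
        if bin_val == self.mps:
            # MPS observed: move toward higher confidence.
            self.p_state_idx = min(self.p_state_idx + 1, self.MAX_STATE)
        else:
            if self.p_state_idx == 0:
                # At the lowest state, a non-MPS bin flips the dominant symbol.
                self.mps = 1 - self.mps
            else:
                self.p_state_idx -= 1

# One context variable is recorded per context index ctxIdxInc.
contexts = {ctx_idx: ContextVariable() for ctx_idx in range(3)}
```

The dictionary at the end mirrors the fact that the context record update unit manages one context variable CV per context index ctxIdxInc.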
Further, the context index ctxIdxInc may be a value that directly specifies the context, or an increment added to a base value.
The bit decoding unit 30112 decodes each bit (Bin) relating to the difference vector included in the coded stream Te with reference to the context variable CV recorded in the context record update unit 30111. The bit decoding unit 30112 may use a decoding scheme corresponding to CABAC (Context-based Adaptive Binary Arithmetic Coding). The bit decoding unit 30112 supplies the decoded value of the Bin to the vector difference syntax decoding unit 3013. The decoded value of the Bin is also supplied to the context record update unit 30111 and referred to for updating the context variable CV.
The vector difference syntax decoding unit 3013 derives a context index ctxIdxInc for decoding each Bin of abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], abs_mvd_minus2[XY], and mvd_sign_flag[XY], which are the syntax elements constituting the vector difference, and outputs the context index ctxIdxInc to the arithmetic code decoding unit 3011. These syntax elements are decoded by the arithmetic code decoding unit 3011. abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], and abs_mvd_minus2[XY] are, respectively, a flag indicating whether the absolute value of the difference vector exceeds 0 (equivalently, a flag indicating whether the difference vector is 0), a flag indicating whether the absolute value of the difference vector exceeds 1, and a value obtained by subtracting 2 from the absolute value of the difference vector, and mvd_sign_flag[XY] is a flag indicating the sign of the difference vector. The suffix XY is a variable whose value is 0 for the X component and 1 for the Y component.
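The relationship among these syntax elements can be illustrated with the following hedged Python sketch, which reconstructs one component of the difference vector from the decoded flags (the function name and argument layout are illustrative, not part of the embodiment; the bitstream parsing itself is omitted).

```python
# Sketch: recover one component of the difference vector mvd from the
# syntax elements described above.

def decode_mvd_component(greater0_flag, greater1_flag=0, abs_minus2=0, sign_flag=0):
    """Reconstruct one component of the difference vector.

    greater0_flag: 1 if |mvd| > 0
    greater1_flag: 1 if |mvd| > 1 (present only when greater0_flag == 1)
    abs_minus2:    |mvd| - 2      (present only when greater1_flag == 1)
    sign_flag:     1 for negative (present only when greater0_flag == 1)
    """
    if greater0_flag == 0:
        return 0
    abs_mvd = 1 if greater1_flag == 0 else abs_minus2 + 2
    return -abs_mvd if sign_flag else abs_mvd
```

For example, a decoded tuple (1, 1, 3, 1) yields a difference-vector component of −5, and (0,) yields 0.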
The vector difference syntax decoding unit 3013 derives the context index ctxIdxInc based on the decoded reference layer information reference_layer_info. Here, the vector difference syntax decoding unit 3013 may store in advance derivation table information in which reference_layer_info and the context index ctxIdxInc are associated with each other, and use this table information to derive the context index ctxIdxInc. Next, examples of the derivation table information will be described.
Fig. 26 is a diagram showing an example of the derivation table information.
Fig. 26(a) shows, for the syntax element abs_mvd_greater0_flag[] indicating whether the absolute value of the difference vector exceeds 0, the value of the context index for each bin. Fig. 26(a) shows that the context index for binIdx = 0, that is, the 1st bit, is always 0. Since abs_mvd_greater0_flag[XY] is 1-bit information, no values of the context index ctxIdxInc correspond to the other bits.
In fig. 26(b), the context index ctxIdxInc of the syntax element abs_mvd_greater0_flag[XY] is shown for each combination of the reference layer information reference_layer_info and the X/Y component. In the example shown in fig. 26(b), the value of the context index ctxIdxInc is always 0 regardless of the reference layer information reference_layer_info, the X component, and the Y component. That is, in this example, the syntax element abs_mvd_greater0_flag[XY] is decoded using the same context regardless of whether the layer of the reference picture (reference layer) on which the corresponding vector depends is the base layer (base view) or an outer layer (other view), and regardless of whether the syntax element of the difference vector concerns the X component or the Y component.
In the CABAC described above, when the probability that a Bin becomes 1 (or the probability that it becomes 0) differs depending on conditions, the context variables CV are used in order to exploit the conditional probabilities effectively and reduce the code amount. In this example, the context is derived without depending on whether the difference vector component is X or Y; this presumes that the probability that the syntax element of the difference vector is 0 does not differ greatly between the components.
Note that, in this and the later-described examples, the target syntax element is not limited to the one shown in fig. 26; the reference layer information reference_layer_info and the context index ctxIdxInc may be associated in the same manner for other syntax elements, such as abs_mvd_greater1_flag[XY].
Next, the configuration of the entropy decoding unit 301f according to modification D6 will be described.
Fig. 25 is a block diagram showing the configuration of the entropy decoding unit 301f according to modification D6.
As shown in fig. 25, the entropy decoding unit 301f of modification D6 includes, in addition to the configuration of the entropy decoding unit 301 (fig. 24), a reference index decoding unit 3012 and a reference layer determination unit 30371, and includes a vector difference syntax decoding unit 3013f instead of the vector difference syntax decoding unit 3013. The reference index decoding unit 3012 decodes the reference picture index refIdxLX from the coded stream Te and outputs the decoded reference picture index to the reference layer determination unit 30371. As already described, the reference layer determination unit 30371 determines the reference layer information reference_layer_info based on the reference picture index refIdxLX.
The vector difference syntax decoding unit 3013f determines the context index ctxIdxInc using one of the following derivation tables, based on the reference layer information reference_layer_info determined by the reference layer determination unit 30371.
Fig. 27 is a diagram showing another example of the derivation table information.
Fig. 27(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes the value 0 or 1.
Fig. 27(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY]. As in fig. 26(b), this example also determines the context index ctxIdxInc on the conditions of the reference layer information reference_layer_info and the value of XY, which distinguishes whether the syntax element abs_mvd_greater0_flag[XY] to be decoded concerns the X component or the Y component. In this example, when the reference layer information reference_layer_info is 0, that is, when the reference picture is of the same layer or the same view as the target picture and the difference vector concerns a motion vector, the same context index (ctxIdxInc = 0) is associated with both the X component and the Y component of the syntax element of the difference vector. When the reference layer information reference_layer_info is 1, that is, when the reference picture is of a layer or view different from the target picture and the difference vector concerns a displacement vector, context indices ctxIdxInc of different values are associated with the X component and the Y component of the syntax element of the difference vector.
In the example of fig. 27(b), ctxIdxInc = 0 for the X component of the displacement vector so that it shares a context with the motion vector, and ctxIdxInc = 1 for the Y component of the displacement vector so that it has a context different from the motion vector.
In general, when the reference picture is a view image different from the target picture (inter-view prediction among outer-layer prediction), the Y component of the displacement vector indicating the displacement from the reference picture concentrates around 0. In this case, the Y component of the difference vector of the displacement vector also tends to concentrate around 0. That is, the probability that abs_mvd_greater0_flag[1], the Y-component flag indicating whether the absolute value of the difference vector exceeds 0, is 1 under inter-view prediction is significantly lower than the corresponding probability for same-layer prediction (motion vectors) and for the X-component flag abs_mvd_greater0_flag[0] under inter-view prediction (displacement vectors). Since the probability of each Bin value thus differs greatly depending on the reference picture and the vector component, assigning one context to the X components of same-layer and outer-layer prediction and a different context to the Y component of outer-layer prediction yields an effect of reducing the code amount of the difference vector.
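The derivation of fig. 27(b) can be sketched as the following minimal Python function, assuming reference_layer_info and the component indicator XY (0: X, 1: Y) as inputs (the function name is illustrative, not part of the embodiment).

```python
# Sketch of the context index derivation of fig. 27(b): the X component of a
# displacement vector shares context 0 with the motion vector, while its
# Y component is given its own context 1.

def ctx_idx_inc_fig27(reference_layer_info, xy):
    if reference_layer_info == 0:    # same-layer prediction (motion vector)
        return 0
    return 0 if xy == 0 else 1       # outer-layer prediction (displacement vector)
```

Only the Y component of the displacement vector is separated into context 1; all other cases share context 0, matching the table.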
Fig. 28 is a diagram showing another example of the derivation table information.
Fig. 28(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes the value 0 or 1.
Fig. 28(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY].
In this example, the values 0 and 1 of the reference layer information reference_layer_info correspond to different context indices ctxIdxInc. Specifically, for same-layer prediction (motion vectors), the same context (ctxIdxInc = 0) is assigned to both the X component and the Y component, and for outer-layer prediction (displacement vectors), a context different from that of same-layer prediction (ctxIdxInc = 1) is assigned to both the X component and the Y component.
In general, when the reference image is another image of the same viewpoint as the target image (for example, a spatially scalable low-resolution image), both the X component and the Y component of the displacement vector indicating the displacement from the reference image concentrate around 0. In this case, the difference vector of the displacement vector also tends to concentrate around 0. That is, the probability that the flag abs_mvd_greater0_flag[XY] indicating whether the absolute value of the difference vector exceeds 0 is 1 under outer-layer prediction is significantly lower than under same-layer prediction (motion vectors). Exploiting this difference in occurrence probability, assigning different contexts to the difference vectors of same-layer prediction (motion vectors) and of outer-layer prediction (displacement vectors) yields an effect of reducing the code amount of the difference vector.
Fig. 29 is a diagram showing another example of the derivation table information.
Fig. 29(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes the value 0 or 1.
Fig. 29(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY].
In this example, when the reference layer information reference_layer_info is 0, the same context index (ctxIdxInc = 0) is commonly assigned to the X component and the Y component of the difference vector for the flag abs_mvd_greater0_flag[XY], as in figs. 27 and 28. When the reference layer information is 1, the context index ctxIdxInc is switched depending on whether the outer-layer prediction is prediction between different view images (inter_view) or prediction between images of the same view (non inter_view). In the case of prediction between different view images, as in fig. 27, a context index different from that of the X component (ctxIdxInc = 1) is assigned to the Y component of the difference vector for the flag abs_mvd_greater0_flag[XY]. In the case of prediction between images of the same view, as in fig. 28, a context index different from that of same-layer prediction (ctxIdxInc = 1) is assigned to both the X component and the Y component of the difference vector for the flag abs_mvd_greater0_flag[XY].
As described above, the Y component of the difference vector is very likely to be 0 when the displacement vector is a vector between different viewpoint images, and both the X component and the Y component of the difference vector are very likely to be 0 when the displacement vector is a vector between images of the same viewpoint; by assigning contexts different from those of the motion vector accordingly, an effect of reducing the code amount of the difference vector can be obtained.
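The derivation of fig. 29(b) can be sketched as follows in Python; the boolean inter_view, which distinguishes prediction between different view images from prediction between images of the same view, is an assumed input, and the function name is illustrative.

```python
# Sketch of the derivation of fig. 29(b): same-layer prediction always uses
# context 0; outer-layer prediction between different viewpoint images gives
# a separate context only to the Y component, while outer-layer prediction
# between same-viewpoint images assigns context 1 to both components.

def ctx_idx_inc_fig29(reference_layer_info, inter_view, xy):
    if reference_layer_info == 0:
        return 0                      # motion vector: X and Y share context 0
    if inter_view:
        return 0 if xy == 0 else 1    # different-view displacement vector
    return 1                          # same-view displacement vector: X and Y
```

This combines the behavior of fig. 27 (inter-view case) and fig. 28 (same-view case) in a single derivation.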
Fig. 30 is a diagram showing another example of the derivation table information.
Fig. 30(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes one of the values 0 to 2; that is, 3 contexts are used in decoding.
Fig. 30(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY].
In this example, when the reference layer information reference_layer_info is 0 and same-layer prediction is used, the same context index ctxIdxInc (ctxIdxInc = 0) is assigned to the X component and the Y component of abs_mvd_greater0_flag[XY]. When the reference layer information reference_layer_info is 1 and outer-layer prediction is used, the X component and the Y component of the flag abs_mvd_greater0_flag[] are assigned contexts different from the same-layer case and also different from each other: ctxIdxInc = 1 for the X component and ctxIdxInc = 2 for the Y component.
As described above, the same context index derivation method is used regardless of whether the displacement vector is a vector between different viewpoint images or a vector between images of the same viewpoint, while different contexts are still assigned according to whether the vector is a displacement vector or a motion vector; this also yields an effect of reducing the code amount of the difference vector.
Fig. 31 is a diagram showing another example of the derivation table information.
Fig. 31(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes one of the values 0 to 2.
Fig. 31(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY].
In this example, when the reference layer information reference_layer_info is 1, that is, in the case of outer-layer prediction, the X component and the Y component of the flag abs_mvd_greater0_flag[] correspond to different context indices ctxIdxInc, and the Y component of same-layer prediction and the X component of outer-layer prediction correspond to the same context index ctxIdxInc. Specifically, when the reference layer information reference_layer_info is 0, that is, in the case of same-layer prediction, ctxIdxInc = 0 for the X component and ctxIdxInc = 1 for the Y component of the difference vector. When the reference layer information reference_layer_info is 1, that is, in the case of outer-layer prediction, ctxIdxInc = 1 for the X component and ctxIdxInc = 2 for the Y component of the difference vector.
This exploits the case where the probability that the Y component of the difference vector of outer-layer prediction is 0 is high, the probability that the X component of the difference vector of same-layer prediction is 0 is comparatively low, and the X component of the difference vector of outer-layer prediction and the Y component of the difference vector of same-layer prediction are 0 with roughly the same probability. This yields an effect of reducing the code amount of the difference vector.
(modification D7)
Next, another modification D7 of the present embodiment will be described.
The image decoding device 31g (not shown) according to modification D7 includes an entropy decoding unit 301g instead of the entropy decoding unit 301f of the image decoding device 31 f. The image decoding device 31g according to modification D7 has the same configuration as the image decoding device 31f except for the entropy decoding unit 301 g. Therefore, the same reference numerals are given to the same structures, and the above description is applied.
Fig. 32 is a block diagram showing the configuration of the entropy decoding unit 301g according to modification D7.
As shown in fig. 32, the entropy decoding unit 301g of the present modification D7 includes a layer ID decoding unit 3014 and a target layer determination unit 30171 in addition to the configuration of the entropy decoding unit 301 (see fig. 24), and includes a vector difference syntax decoding unit 3013g instead of the vector difference syntax decoding unit 3013. The layer ID decoding unit 3014 decodes the layer ID from a NAL (Network Abstraction Layer) unit header included in the coded stream Te. The NAL unit header is information contained at the head of each NAL unit, which is one constituent unit of the coded stream Te. The NAL unit header includes decoding parameters including the layer ID described above.
The target layer determination unit 30171 identifies whether the decoding target image is a base layer image or a non-base layer image based on the decoded layer ID, and derives the target layer information target_layer_info. The value of the target layer information target_layer_info is, for example, 0 when the decoding target image is a base layer image and 1 when it is a non-base layer image. The target layer determination unit 30171 may instead identify whether the image is a base view image or a non-base view image, rather than whether it is a base layer image or a non-base layer image. In this case, the target layer determination unit 30171 determines that the view of the decoding target image is the base view when the view ID of the decoding target image is 0, and that it is not the base view when the view ID is other than 0. Here, the value of the target layer information target_layer_info is, for example, 0 for the base view and 1 for a non-base view. The view ID is derived from the layer ID using a prestored table indicating the correspondence between layer IDs and view IDs. The target layer determination unit 30171 extracts the table indicating this correspondence from the VPS included in the coded stream.
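The target layer determination described above can be sketched as follows; the layer-to-view correspondence table here is a hypothetical stand-in for the table extracted from the VPS, and the names are illustrative.

```python
# Sketch of the derivation of target_layer_info in modification D7. The layer
# ID is decoded from the NAL unit header; the view ID is looked up in a
# layer-to-view correspondence table (here a hypothetical table; the real one
# is extracted from the VPS in the coded stream).

LAYER_TO_VIEW = {0: 0, 1: 1, 2: 2}   # assumed correspondence for illustration

def target_layer_info_from_layer_id(layer_id, by_view=False):
    if by_view:
        # Identify base view vs. non-base view via the view ID.
        view_id = LAYER_TO_VIEW[layer_id]
        return 0 if view_id == 0 else 1
    # Identify base layer vs. non-base layer directly from the layer ID.
    return 0 if layer_id == 0 else 1
```

Note that, unlike the reference layer of modification D6, this determination needs only the layer ID from the NAL unit header and no reference picture index.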
The vector difference syntax decoding unit 3013g of modification D7 derives the context index ctxIdxInc from the determined target layer information target_layer_info using the derivation table shown in fig. 33.
Fig. 33 is a diagram showing an example of the derivation table information according to modification D7.
Fig. 33(a) shows that the context index ctxIdxInc provided for the syntax element abs_mvd_greater0_flag[] takes the value 0 or 1.
Fig. 33(b) is derivation table information showing the context index ctxIdxInc for the syntax element abs_mvd_greater0_flag[XY].
In this example, when the target layer information target_layer_info is 0, that is, for the base layer (base view), the same context is used for the X component and the Y component of the syntax element abs_mvd_greater0_flag[]; for a non-base layer (or non-base view), the X component and the Y component of the flag abs_mvd_greater0_flag[] correspond to different context indices ctxIdxInc. When the target layer information target_layer_info is 1, that is, for a non-base layer (extension layer, non-base view), ctxIdxInc = 0 for the X component and ctxIdxInc = 1 for the Y component. Thus, context index 0 is shared by both components of the difference vector of the base layer (base view) and the X component of the non-base layer, while context index 1 is used as a separate context for the Y component of the difference vector of the non-base layer.
That is, when the decoding target image is in the base layer, only images of the same layer are in principle used as reference images, so the vectors used for decoding base-layer images are motion vectors, that is, vectors of same-layer prediction. In a non-base layer, images of a different layer are also used as reference images in addition to images of the same layer, so a vector used in decoding may be either a motion vector (same-layer prediction) or a displacement vector (outer-layer prediction). In that case, assigning different contexts to the X component and the Y component of the flag indicating whether the absolute value of the difference vector exceeds 0 (the syntax element abs_mvd_greater0_flag[XY]) yields an effect of reducing the code amount of the difference vector. Whereas modification D6 described above requires identifying the reference layer from the reference picture index, modification D7 can be implemented by identifying only the target layer, without identifying the reference layer. Unlike the reference layer, the target layer is fixed regardless of the prediction unit (no reference picture index is needed), so it can be identified easily. Therefore, in modification D7, the decoding process is simpler than in modification D6 described above.
(modification D8)
Next, another modification D8 of the present embodiment will be described. Modification D8 reduces the memory required for the prediction parameter memory 307 by restricting the coordinates referred to when the AMVP prediction parameter derivation unit 3032 and the merge prediction parameter derivation unit 3036 in the image decoding device 31 refer to the prediction parameters (vectors and reference list indices) stored in the prediction parameter memory 307. More specifically, the reference region is restricted when the prediction parameters of already processed blocks of a layer image different from the layer image of the decoding target picture are referred to.
The image decoding device 31h according to modification D8 includes an AMVP prediction parameter derivation unit 3032h in place of the AMVP prediction parameter derivation unit 3032 in the external prediction parameter decoding unit 303 (fig. 3).
Fig. 36 is a block diagram showing the configuration of the AMVP prediction parameter derivation unit 3032h and the prediction parameter memory 307 according to modification D8.
As already described, the AMVP prediction parameter derivation part 3032h includes the vector candidate derivation part 3033 and the prediction vector selection part 3034. Further, the AMVP prediction parameter derivation part 3032h includes a prediction parameter reference part 3039. The prediction parameter referring unit 3039 includes a spatial prediction reference address converting unit 30391, a temporal prediction reference address converting unit 30392, and an external layer reference address converting unit 30393 (inter-layer reference address converting unit).
The vector candidate derivation unit 3033 included in the AMVP prediction parameter derivation unit 3032h refers to the prediction parameters of the already decoded prediction unit PU to derive vector candidates.
At this time, in decoding order (for example, raster scan order), the following prediction parameters are referred to: those of prediction units in the target CTB containing the decoding target block (target CTB prediction parameters), those of prediction units of the CTB column located to the left of the target CTB (left CTB column prediction parameters), those of prediction units of the CTB line located above the target CTB (upper CTB line prediction parameters), those of prediction units of previously decoded pictures (temporal prediction parameters), and those of prediction units of a layer different from the target layer (outer-layer prediction parameters). The prediction parameter reference unit 3039 refers to the target CTB prediction parameter memory 3071, the left CTB column prediction parameter memory 3072, the upper CTB line prediction parameter memory 3073, the temporal prediction parameter memory 3074, and the outer-layer prediction parameter memory 3075, respectively, which are included in the prediction parameter memory 307. At this time, the spatial prediction reference address conversion unit 30391 and the temporal prediction reference address conversion unit 30392 convert the coordinates of the reference destination. The operation of the outer-layer reference address conversion unit 30393 will be described later.
Note that when the vector candidate derivation unit 3033 in the AMVP prediction parameter derivation unit 3032h refers to the prediction parameters of a layer image of a layer different from that of the decoding target image in deriving a vector candidate, that is, when the vector candidate is an extended vector candidate, the processing may be the same as the vector candidate derivation in the outer-layer vector candidate derivation unit 30337 (see fig. 11). Likewise, when the merge candidate derivation unit 30364 refers to the prediction parameters of a layer image of a layer different from that of the decoding target image in deriving a merge candidate, that is, when the merge candidate is an extended merge candidate, the processing may be the same as in the outer-layer merge candidate derivation unit 30363 (see fig. 13).
Fig. 37 is a diagram showing an example of equations used for coordinate transformation.
As shown in the formula of fig. 37(a), when the prediction unit to be referred to is in the same image as the image of the target prediction unit (an image of the same layer and the same display time), and the reference coordinate (xP, yP-1) is located above the target CTB (that is, when yP-1 is smaller than ((yC >> Log2CtbSizeY) << Log2CtbSizeY)), the spatial prediction reference address conversion unit 30391 converts the X coordinate xP to xP′ by the following formula.
xP′ = ((xP >> 3) << 3) + ((xP >> 3) & 1) * 7
In this case, conversion is performed such that 0 is obtained when xP is 0 to 7, and 15 is obtained when xP is 8 to 15. The prediction parameter reference unit 3039 refers to the prediction parameters using the transformed coordinates.
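The effect of the formula above can be checked with the following Python sketch (the function name is illustrative): within each 16-sample span, the 8 left coordinates are mapped to the leftmost position and the 8 right coordinates to the rightmost one.

```python
# Sketch of the X-coordinate restriction applied by the spatial prediction
# reference address conversion unit 30391: implements
#   xP' = ((xP >> 3) << 3) + ((xP >> 3) & 1) * 7
# so that xP = 0..7 maps to 0 and xP = 8..15 maps to 15, and so on per
# 16-sample span.

def compress_x(xP):
    return ((xP >> 3) << 3) + ((xP >> 3) & 1) * 7
```

For instance, every coordinate in 0..7 is mapped to 0 and every coordinate in 8..15 to 15, matching the text.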
As shown in the equations of fig. 37(b), in the vector candidate derivation by the vector candidate derivation unit 3033, when the prediction unit to be referred to is in the same layer as the target prediction unit but at a different display time, the temporal prediction reference address conversion unit 30392 converts the reference coordinates (xP, yP) into (xP′, yP′).
xP′=(xP>>4)<<4
yP′=(yP>>4)<<4
The prediction parameter reference unit 3039 performs the reference using the transformed coordinates. By the above conversion, both the X coordinate and the Y coordinate of the referred position become multiples of 16.
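Similarly, the 16 × 16 alignment performed by the temporal prediction reference address conversion unit 30392 can be sketched as follows (the function name is illustrative; the same function is applied to both xP and yP).

```python
# Sketch of the temporal prediction reference address conversion:
#   xP' = (xP >> 4) << 4,  yP' = (yP >> 4) << 4
# i.e., each coordinate is aligned down to a multiple of 16, so prediction
# parameters are referred to in 16x16 units.

def align16(p):
    return (p >> 4) << 4

def compress_temporal(xP, yP):
    return align16(xP), align16(yP)
```

Every coordinate inside a 16 × 16 block is thus mapped to that block's top-left corner.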
Next, an example in which the above-described coordinate transformation is performed and the region of the prediction parameter is referred to in the prediction parameter memory 307 will be described.
Fig. 38 is a conceptual diagram illustrating an example of a region to be referred to by coordinate transformation.
Fig. 38 shows a reference area for a layer 1 image in the upper stage and a reference area for a layer 0 image in the lower stage. The left and right indicate the decoding order.
Fig. 38 shows the picture containing the target prediction unit at the current time (layer 1(POC_curr)), an already decoded picture of the same layer (layer 1) as the target picture at a past time (layer 1(POC_ref)), and a picture of a layer (layer 0) different from that of the target prediction unit at the current time (layer 0(POC_curr)). This example shows the case where the regions of 2 pictures (layer 1(POC_ref) and layer 0(POC_curr)) are referred to.
As shown in fig. 38, in the prediction parameter reference to the CTB located above the target CTB, the spatial prediction reference address conversion unit 30391 converts (restricts) the reference address, and the prediction parameters are referred to in 8 × 4 units. The prediction parameters of prediction units at a display time different from that of the target image, in the same layer (layer 1) as the target image, are referred to in 16 × 16 units via the temporal prediction reference address conversion unit 30392. In contrast, the prediction parameters of a layer other than that of the target image are referred to in minimum PU units (8 × 4, 4 × 8). In this case, the prediction parameters need to be stored in the outer layer prediction parameter memory 3077 in 4 × 4 units, the greatest common divisor of the minimum PU sizes, and a very large prediction parameter memory is required.
Returning to fig. 36, the prediction parameter reference unit 3039 refers to the prediction parameters stored in the prediction parameter memory 307 when deriving merge candidates in the merge candidate derivation unit 30364, which will be described later.
Here, in the vector candidate derivation by the AMVP prediction parameter derivation unit 3032h and in the merge candidate derivation by the merge candidate derivation unit 30361, when the prediction parameters of a layer image belonging to a layer other than that of the decoding target image are referred to, that is, when the outer-layer prediction parameter memory 3075 within the prediction parameter memory 307 (which stores the prediction parameters of layer images of different layers) is referred to, the outer-layer reference address conversion unit 30393 converts the address to be referred to.
Next, a block diagram showing the configuration of the merged prediction parameter deriving unit 3036h and the prediction parameter memory 307 according to the present modification is shown.
Fig. 39 is a block diagram showing the configuration of the merged prediction parameter deriving unit 3036h and the prediction parameter memory 307 according to modification D8.
The merge prediction parameter derivation unit 3036h includes a merge candidate derivation unit 30364 and a merge candidate selection unit 30365. The merge prediction parameter derivation unit 3036h further includes a prediction parameter reference unit 3039 (see fig. 36).
The merge candidate derivation unit 30364 derives the prediction parameters referred to by the prediction parameter reference unit 3039 as merge candidates, and outputs the derived merge candidates to the merge candidate selection unit 30365.
The merge candidate selection unit 30365 selects a prediction parameter (for example, the vector mvLX and the reference picture index refIdxLX) as a merge candidate indicated by the input merge index merge _ idx, from among the derived merge candidates. The merge candidate selection unit 30365 outputs the selected merge candidate.
Fig. 40 is a diagram showing another example of equations used for coordinate transformation.
When, in the vector candidate derivation by the vector candidate derivation unit 3033, the reference layer information reference_layer_info is 1, that is, when performing outer layer prediction, the outer-layer reference address conversion unit 30393 uses the equations shown in fig. 40(a) to convert the reference coordinates (xP, yP) into (xP′, yP′).
xP′=(xP>>3)<<3
yP′=(yP>>3)<<3
The operation sets the lower 3 bits to 0, and can also be expressed by the following formulas using the bitwise AND operator &.
xP′=(xP&~7)
yP′=(yP&~7)
Here, "~" is an operator indicating bitwise negation (NOT).
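The equivalence of the shift form and the mask form can be checked directly; note that clearing the lower 3 bits corresponds to the mask ~7. A minimal sketch, not part of the specification:

```c
#include <assert.h>

/* Both forms zero the lower 3 bits, restricting coordinates to
 * multiples of 8. */
static int clear_low3_shift(int v) { return (v >> 3) << 3; }
static int clear_low3_mask(int v)  { return v & ~7; }
```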
The coordinate transformation in the merge candidate derivation by the merge candidate derivation unit 30364 may be the same as this.
Fig. 41 is a conceptual diagram illustrating another example of the region to be referred to by the coordinate transformation.
Fig. 41 shows the reference region when the outer-layer reference address conversion unit 30393 converts the coordinates of the target prediction unit and the reference prediction unit using the equations of fig. 40(a).
By the above-described coordinate transformation, in the outer layer prediction, the prediction parameter memory is referred to in 8 × 8 units (layer 0 (POC_curr)). Referring to the prediction parameter memory in such wide units is more effective than referring to it in smaller units, for example 4 × 4 units (or 4 × 8 and 8 × 4 units). That is, even when the prediction parameters of a certain layer image are referred to in the decoding of another layer image (outer layer prediction of prediction parameters), the reference addresses are limited to predetermined addresses, so the prediction parameters other than those at the predetermined addresses can be deleted, that is, compressed.
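The memory saving can be quantified with a simple count of stored entries per area. The 64 × 64 area size here is an assumption chosen for illustration:

```c
#include <assert.h>

/* Number of prediction-parameter entries stored per 64x64 area when
 * parameters are kept at a granularity of unit_w x unit_h samples. */
static int entries_per_64x64(int unit_w, int unit_h) {
    return (64 / unit_w) * (64 / unit_h);
}
```

Storing in 8 × 8 units thus needs only a quarter of the entries required for 4 × 4 units.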
Returning to fig. 40, when the reference layer information reference_layer_info is 1 in the vector candidate derivation by the vector candidate derivation unit 3033, that is, when performing outer layer prediction, the outer-layer reference address conversion unit 30393 uses the equations of fig. 40(b) to convert the coordinates (xP, yP) to be referred to into (xP′, yP′).
xP′=((xP>>3)<<3)+((xP>>3)&1)*7
yP′=((yP>>3)<<3)+((yP>>3)&1)*7
Fig. 42 is a conceptual diagram illustrating another example of the region to be referred to by the coordinate transformation.
Fig. 42 shows the reference region when the outer-layer reference address conversion unit 30393 converts the coordinates of the target prediction unit and the reference prediction unit using the equations of fig. 40(b).
The example shown in fig. 42 shows that, in layer 0 (POC_curr), the prediction parameter memory is referred to at the corner region of each 16 × 16 block. In this case, in the outer layer prediction, the prediction parameter memory is not referred to in small units such as 4 × 4 (or 4 × 8 and 8 × 4), so the memory amount of the prediction parameter memory is effectively reduced.
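The corner mapping of fig. 40(b) can be sketched as follows; each coordinate inside a 16 × 16 block is moved to the nearest block corner (the function name is illustrative):

```c
#include <assert.h>

/* Transform of fig. 40(b), applied to each coordinate: values 0..7
 * within every 16-sample span map to the low corner (0, 16, 32, ...),
 * values 8..15 map to the high corner (15, 31, 47, ...). */
static int corner_coord(int p) {
    return ((p >> 3) << 3) + ((p >> 3) & 1) * 7;
}
```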
(Structure of image encoding device)
Next, the configuration of the image coding device 11 according to the present embodiment will be described.
Fig. 14 is a block diagram showing the configuration of the image coding device 11 according to the present embodiment.
The image encoding device 11 includes a predicted image generation unit (predicted image generation unit) 101, a subtraction unit 102, a DCT/quantization unit 103, an entropy encoding unit 104, an inverse quantization/inverse DCT unit 105, an addition unit 106, a prediction parameter memory 108, a reference image memory (reference image storage unit) 109, an encoding parameter determination unit 110, and a prediction parameter encoding unit 111. The prediction parameter encoding unit 111 includes an external prediction parameter encoding unit 112 and an intra prediction parameter encoding unit 113.
The predicted image generating unit 101 generates a predicted image block P for each image of each viewpoint of a layer image T inputted from the outside, for each block which is an area into which the image is divided. Here, the predicted image generating unit 101 reads out the reference image block from the reference image memory 109 based on the prediction parameters input from the prediction parameter encoding unit 111. The prediction parameter input from the prediction parameter encoding unit 111 is, for example, a motion vector mvLX or a displacement vector dvLX. The predicted image generation unit 101 reads a reference image block of a block located at a position indicated by a predicted motion vector or displacement vector with the block to be encoded as a starting point. The predicted image generation unit 101 generates a predicted image block P using 1 prediction mode out of a plurality of prediction modes with respect to the read reference image block. The predicted image generating unit 101 outputs the generated predicted image block P to the subtracting unit 102.
The predicted image generation unit 101 selects a prediction scheme that minimizes an error value based on a difference between a signal value of each pixel of a block included in the layer image and a signal value of each corresponding pixel of the predicted image block P, for example, in order to select the prediction scheme. The method of selecting the prediction method is not limited to this.
In the case where the image to be encoded is a base view image, the plurality of prediction methods are intra prediction, motion prediction, and merge prediction. The motion prediction is prediction between time points in the above-described external prediction. The merged prediction is a prediction using the same reference image block and prediction parameters as those of a block that has already been encoded and is located within a predetermined range from the block to be encoded. In the case where the image to be encoded is a non-base view image, the plurality of prediction modes are intra prediction, motion prediction, merge prediction, and displacement prediction. The displacement prediction (parallax prediction) is prediction between different layer images (different viewpoint images) in the above-described external prediction.
When intra prediction is selected, the predicted image generating unit 101 outputs a prediction mode predMode indicating an intra prediction mode used when generating the predicted image block P to the prediction parameter encoding unit 111.
When motion prediction is selected, the predicted image generating unit 101 stores a motion vector mvLX used when generating the predicted image block P in the prediction parameter memory 108 and outputs the motion vector mvLX to the external prediction parameter encoding unit 112. The motion vector mvLX represents a vector from the position of the coding target block to the position of the reference picture block when the predicted picture block P is generated. The information indicating the motion vector mvLX may include information indicating a reference picture (for example, the reference picture index refIdxLX, the picture sequence number POC), or may be information indicating a prediction parameter. The predicted image generating unit 101 outputs a prediction mode predMode indicating an external prediction mode to the prediction parameter encoding unit 111.
When the displacement prediction is selected, the predicted image generating unit 101 stores the displacement vector dvLX used when generating the predicted image block P in the prediction parameter memory 108, and outputs the displacement vector dvLX to the external prediction parameter encoding unit 112. The displacement vector dvLX represents a vector from the position of the encoding target block to the position of the reference image block when the predicted image block P is generated. The information indicating the displacement vector dvLX may include information indicating a reference picture (for example, a reference picture index refIdxLX, a view identifier view _ id), or may be information indicating a prediction parameter. The predicted image generating unit 101 outputs a prediction mode predMode indicating an external prediction mode to the prediction parameter encoding unit 111.
When merge prediction is selected, the predicted image generation unit 101 outputs a merge index merge _ idx indicating the selected reference image block to the external prediction parameter encoding unit 112. The predicted image generating unit 101 outputs a prediction mode predMode indicating the merged prediction mode to the prediction parameter encoding unit 111.
The subtraction unit 102 subtracts the signal value of the predicted image block P input from the predicted image generation unit 101 from the signal value of the corresponding block of the layer image T input from the outside, for each pixel, and generates a residual signal. The subtraction unit 102 outputs the generated residual signal to the DCT/quantization unit 103 and the coding parameter determination unit 110.
The DCT/quantization unit 103 performs DCT on the residual signal input from the subtraction unit 102 to calculate a DCT coefficient. The DCT/quantization unit 103 quantizes the calculated DCT coefficient to obtain a quantization coefficient. The DCT/quantization unit 103 outputs the obtained quantization coefficient to the entropy coding unit 104 and the inverse quantization/inverse DCT unit 105.
The entropy encoding unit 104 receives the quantized coefficients from the DCT/quantization unit 103, and receives the encoding parameters from the encoding parameter determination unit 110. The inputted coding parameters include, for example, reference picture index refIdxLX, vector index mvp _ LX _ idx, difference vector mvdLX, prediction mode predMode, and merge index merge _ idx.
The entropy encoding unit 104 entropy encodes the input quantization coefficient and encoding parameter to generate an encoded stream Te, and outputs the generated encoded stream Te to the outside.
The inverse quantization/inverse DCT unit 105 inversely quantizes the quantization coefficient input from the DCT/quantization unit 103 to obtain a DCT coefficient. The inverse quantization/inverse DCT unit 105 performs inverse DCT on the obtained DCT coefficient to calculate a decoded residual signal. The inverse quantization/inverse DCT unit 105 outputs the calculated decoded residual signal to the adder 106.
The adder 106 adds the signal value of the predicted image block P input from the predicted image generator 101 and the signal value of the decoded residual signal input from the inverse quantization/inverse DCT unit 105 for each pixel to generate a reference image block. The adder 106 stores the generated reference image block in the reference image memory 109.
The prediction parameter memory 108 stores the prediction parameters generated by the prediction parameter encoding unit 111 at predetermined positions for each image and block to be encoded.
The reference image memory 109 stores the reference image block generated by the addition unit 106 at a predetermined position for each image and block to be encoded.
The encoding parameter determining unit 110 selects 1 group from a plurality of groups of encoding parameters. The encoding parameter is the above-mentioned prediction parameter or a parameter to be encoded generated in association with the prediction parameter. The predicted image generating unit 101 generates a predicted image block P using each of these sets of encoding parameters.
The encoding parameter determination unit 110 calculates, for each of the plurality of sets, a cost value indicating the amount of information and the magnitude of the encoding error. The cost value is, for example, the sum of the code amount and the squared error multiplied by a coefficient λ. The code amount is the information amount of the encoded stream Te obtained by entropy encoding the quantized coefficients and the encoding parameters. The squared error is the sum over pixels of the squared values of the residual signal calculated by the subtraction unit 102. The coefficient λ is a preset real number greater than zero. The encoding parameter determination unit 110 selects the set of encoding parameters with the smallest calculated cost value. Thus, the entropy encoding unit 104 outputs the selected set of encoding parameters to the outside as the encoded stream Te, and does not output the unselected sets.
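The cost computation described above can be sketched as follows; this is a minimal illustration and the variable names are not from the text:

```c
#include <assert.h>

/* Rate-distortion cost: code amount plus lambda times the sum of
 * squared residual values. */
static double rd_cost(double code_amount, const int *residual, int n,
                      double lambda) {
    double sse = 0.0;
    for (int i = 0; i < n; i++)
        sse += (double)residual[i] * residual[i];
    return code_amount + lambda * sse;
}
```

The encoding parameter determination unit would evaluate this cost for every candidate set and keep the set with the minimum value.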
The prediction parameter encoding unit 111 derives a prediction parameter used when generating a predicted image based on the parameter input from the predicted image generating unit 101, and encodes the derived prediction parameter to generate a set of encoding parameters. The prediction parameter encoding unit 111 outputs the generated set of encoding parameters to the entropy encoding unit 104.
The prediction parameter encoding unit 111 stores, in the prediction parameter memory 108, prediction parameters corresponding to the set selected by the encoding parameter determination unit 110 among the generated sets of encoding parameters.
When the prediction mode predMode input from the predicted image generation unit 101 indicates the external prediction mode, the prediction parameter encoding unit 111 operates the external prediction parameter encoding unit 112. When the prediction mode predMode indicates the intra prediction mode, the prediction parameter encoding unit 111 operates the intra prediction parameter encoding unit 113.
The extrinsic prediction parameter encoding unit 112 derives an extrinsic prediction parameter based on the prediction parameter input from the encoding parameter determination unit 110. The extrinsic prediction parameter encoding unit 112 includes the same configuration as the configuration in which the extrinsic prediction parameter decoding unit 303 (see fig. 2 and the like) derives the extrinsic prediction parameters, as the configuration in which the extrinsic prediction parameters are derived. The configuration of the outer prediction parameter encoding unit 112 will be described later.
The intra-prediction parameter encoding unit 113 specifies the intra-prediction mode IntraPredMode indicated by the prediction mode predMode input from the encoding parameter determination unit 110 as a set of the intra-prediction parameters.
(configuration of external prediction parameter coding section)
Next, the configuration of the external prediction parameter encoding unit 112 will be described.
Fig. 15 is a schematic diagram showing the configuration of the external prediction parameter encoding unit 112 according to the present embodiment.
The external prediction parameter encoding unit 112 includes a merged prediction parameter deriving unit (displacement vector generating unit, prediction parameter deriving unit) 1121, an AMVP prediction parameter deriving unit (displacement vector generating unit, prediction parameter deriving unit) 1122, a subtracting unit 1123, a displacement vector clipping unit (displacement vector limiting unit) 1124, and a prediction parameter merging unit 1126.
The merged prediction parameter derivation unit 1121 has the same configuration as the merged prediction parameter derivation unit 3036 (see fig. 3) described above.
When the prediction mode predMode input from the predicted image generation unit 101 indicates the merge prediction mode, the merge index merge_idx is input from the encoding parameter determination unit 110 to the merge prediction parameter derivation unit 1121. The merge index merge_idx is output to the prediction parameter merge unit 1126. The merge prediction parameter derivation unit 1121 reads, from the prediction parameter memory 108, the reference picture index refIdxLX and the vector mvLX of the reference block indicated by the merge index merge_idx among the merge candidates. A merge candidate is a reference block located within a predetermined range from the encoding target block (for example, a reference block adjoining the lower left end, upper left end, or upper right end of the encoding target block) for which the encoding process has been completed. The merge prediction parameter derivation unit 1121 outputs the read vector mvLX and reference picture index refIdxLX in association with each other to the displacement vector limiter 1124.
The AMVP prediction parameter derivation unit 1122 has the same configuration as the AMVP prediction parameter derivation unit 3032 (see fig. 3) described above.
When the prediction mode predMode input from the predicted image generation unit 101 indicates the external prediction mode, the AMVP prediction parameter derivation unit 1122 receives the vector mvLX from the coding parameter determination unit 110. The AMVP prediction parameter derivation unit 1122 derives a prediction vector mvpLX based on the input vector mvLX. The AMVP prediction parameter derivation unit 1122 outputs the derived prediction vector mvpLX to the subtraction unit 1123. The reference picture index refIdx and the vector index mvp _ LX _ idx are output to the prediction parameter combining unit 1126.
The subtraction unit 1123 subtracts the prediction vector mvpLX input from the AMVP prediction parameter derivation unit 1122 from the vector mvLX input from the encoding parameter determination unit 110, and generates a difference vector mvdLX. When the vector mvLX is an outer layer prediction (displacement vector), the subtraction unit 1123 omits the Y component mvd _ Y included in the difference vector mvdLX and leaves the X component mvd _ X to generate a one-dimensional difference vector (scalar value) mvdLX. As in the reference layer determination unit 30311 (see fig. 4), the determination whether the vector mvLX is a displacement vector or a motion vector is performed based on the reference image index refIdx. The subtraction unit 1123 outputs the generated one-dimensional difference vector mvdLX to the prediction parameter combination unit 1126. When the vector mvLX is a motion vector, the subtraction unit 1123 outputs the Y component mvd _ Y included in the difference vector mvdLX to the prediction parameter combination unit 1126 as it is without being omitted.
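The behavior of the subtraction unit 1123 for the two vector kinds can be sketched as follows; the struct and field names are illustrative:

```c
#include <assert.h>

typedef struct { int x, y; } Vec;
typedef struct { int x, y; int has_y; } DiffVec;

/* Difference-vector generation: for a displacement vector the Y
 * component is omitted (a one-dimensional difference vector); for a
 * motion vector both components are kept. */
static DiffVec make_mvd(Vec mv, Vec mvp, int is_displacement) {
    DiffVec d = { mv.x - mvp.x, 0, 1 };
    if (is_displacement) {
        d.has_y = 0;            /* Y component not encoded */
    } else {
        d.y = mv.y - mvp.y;
    }
    return d;
}
```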
The displacement vector limiter 1124 receives the vector mvLX, the reference picture index refIdx, and the vector index mvp_LX_idx from the merge prediction parameter derivation unit 1121 or the encoding parameter determination unit 110. When the vector mvLX is a displacement vector, the displacement vector limiter 1124 sets the value of the Y component dv_Y, an element of the input displacement vector dvLX, to a predetermined value (for example, zero). The displacement vector limiter 1124 keeps the value of the X component dv_X of the displacement vector dvLX as it is. The displacement vector limiter 1124 outputs the displacement vector dvLX whose Y component dv_Y has been set to the predetermined value, in association with the reference picture index refIdx and the vector index mvp_LX_idx, to the predicted image generation unit 101, and stores it in the prediction parameter memory 108.
As in the reference layer determination unit 30311 (see fig. 4), the determination whether the vector mvLX is a displacement vector or a motion vector is performed based on the input reference image index refIdx.
When the vector mvLX is a motion vector, the displacement vector limiter 1124 outputs the input vector mvLX as it is, in association with the reference picture index refIdx and the vector index mvp_LX_idx, to the predicted image generation unit 101, and stores it in the prediction parameter memory 108.
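The limiter's behavior for both cases can likewise be sketched; the predetermined value is taken to be zero here, as the text suggests:

```c
#include <assert.h>

typedef struct { int x, y; } Vec2;

/* If the vector is a displacement vector, force its Y component to a
 * predetermined value (zero here); a motion vector passes unchanged. */
static Vec2 limit_dv(Vec2 v, int is_displacement) {
    if (is_displacement)
        v.y = 0;
    return v;
}
```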
When the prediction mode predMode input from the predicted image generation unit 101 indicates the merge prediction mode, the prediction parameter merge unit 1126 outputs the merge index merge _ idx input from the encoding parameter determination unit 110 to the entropy encoding unit 104.
When the prediction mode predMode input from the predicted image generation unit 101 indicates the external prediction mode, the prediction parameter combination unit 1126 performs the following processing.
The prediction parameter combining unit 1126 determines whether the difference vector mvdLX input from the subtraction unit 1123 is a difference vector relating to a displacement vector or a difference vector relating to a motion vector. As in the reference layer determination unit 30311 (see fig. 4), this determination is performed based on the input reference picture index refIdx.
When the prediction parameter combining unit 1126 determines that the input difference vector mvdLX is a difference vector relating to a displacement vector, it sets the value of the view arrangement flag camera_arrangement_1D_flag to 1. The prediction parameter combining unit 1126 calculates the 3 kinds of symbols abs_mvd_greater0_flag[0], abs_mvd_minus2[0], and mvd_sign_flag[0] based on the X component mvd_X indicated by the difference vector mvdLX.
The prediction parameter combining unit 1126 does not calculate these signs for the Y component mvd _ Y.
When the prediction parameter combining unit 1126 determines that the input difference vector mvdLX is a difference vector relating to a motion vector, it sets the value of the view arrangement flag camera_arrangement_1D_flag to 0.
The prediction parameter combining unit 1126 calculates the 3 kinds of symbols abs_mvd_greater0_flag[0], abs_mvd_minus2[0], and mvd_sign_flag[0] based on the X component mvd_X indicated by the difference vector mvdLX, and the 3 kinds of symbols abs_mvd_greater0_flag[1], abs_mvd_minus2[1], and mvd_sign_flag[1] for the Y component mvd_Y indicated by the difference vector mvdLX.
The prediction parameter combining unit 1126 combines the view arrangement flag camera_arrangement_1D_flag whose value has been set as above, the symbols calculated for each component, and the reference picture index refIdx and vector index mvp_LX_idx input from the encoding parameter determination unit 110. The prediction parameter combining unit 1126 outputs the combined symbols to the entropy encoding unit 104.
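The derivation of the three symbols for one difference-vector component can be sketched as follows; the −1 return for abs_mvd_minus2 marks "not signaled" and is our convention, not part of the syntax:

```c
#include <assert.h>

/* Derive abs_mvd_greater0_flag, abs_mvd_minus2 and mvd_sign_flag for
 * one component of the difference vector. */
static void mvd_symbols(int mvd, int *greater0, int *abs_minus2,
                        int *sign) {
    int a = mvd < 0 ? -mvd : mvd;
    *greater0 = (a > 0);
    *abs_minus2 = (a >= 2) ? a - 2 : -1;  /* -1: not signaled */
    *sign = (mvd < 0);
}
```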
Thus, in the above example, even though the Y component dvd_Y of the difference vector dvdLX relating to a displacement vector is not encoded, its value is set to a predetermined value. On the other hand, when the X component of the displacement is dominant and the Y component is negligible, the accuracy of displacement prediction does not deteriorate in the above example. Therefore, in the above example, the encoding efficiency is improved without degrading the quality of the decoded image.
(modification E1)
Next, an image encoding device 11a according to modification E1 will be described. Modification E1 is a modification corresponding to modification D1 described above. The same reference numerals are given to the same structures as described above, and the above description is applied.
The image encoding device 11a according to modification E1 includes an AMVP prediction parameter derivation unit 1122a instead of the AMVP prediction parameter derivation unit 1122 (see fig. 15). In addition, the subtracting unit 1123a described later may be included instead of the subtracting unit 1123, or the subtracting unit 1123a described later may not be included. When the subtraction unit 1123a is not included, the displacement vector limiter 1124 (see fig. 15) may be omitted.
The AMVP prediction parameter derivation unit 1122a has the same configuration as the AMVP prediction parameter derivation unit 1122 (see fig. 15). However, when the calculated prediction vector mvpLX is a prediction vector related to a displacement vector, the AMVP prediction parameter derivation unit 1122a limits the Y component dvp _ Y of the prediction vector to a value within a predetermined range, as in the displacement prediction vector limiter unit 30321a (see fig. 5). As in the case of the reference layer determination unit 30311 (see modification D1 and fig. 4), determination as to whether the vector of the prediction vector mvpLX is a displacement vector or a motion vector is performed based on the input reference image index refIdx.
The AMVP prediction parameter derivation unit 1122a reconstructs the prediction vector dvpLX from the X component dvp_X and the limited Y component dvp_Y, and outputs the reconstructed prediction vector dvpLX to the subtraction unit 1123.
When the prediction vector mvpLX is a prediction vector of a motion vector, the AMVP prediction parameter derivation unit 1122a outputs the calculated prediction vector mvpLX to the subtraction unit 1123 as it is.
The subtraction unit 1123a subtracts the prediction vector mvpLX input from the AMVP prediction parameter derivation unit 1122a from the vector mvLX input from the predicted image generation unit 101, and calculates a difference vector mvdLX. The subtraction unit 1123a outputs the calculated difference vector mvdLX to the entropy encoding unit 104, regardless of whether or not the vector mvLX is a displacement vector.
Thus, since the Y component dvp _ Y of the prediction vector relating to the displacement vector is limited to a value in a predetermined range, deterioration in accuracy in displacement prediction can be suppressed. Therefore, according to the present modification E1, the coding efficiency is improved.
(modification E2)
Next, an image encoding device 11b according to modification E2 will be described. Modification E2 is a modification corresponding to modification D2 described above. The same reference numerals are given to the same structures, and the above description is applied.
As in modification E1, the image coding apparatus 11b according to modification E2 includes an AMVP prediction parameter derivation unit 1122a and a displacement vector limiter unit 1124a instead of the AMVP prediction parameter derivation unit 1122 and the displacement vector limiter unit 1124 (see fig. 15).
In the present modification E2, the above-described displacement vector restriction information disparity _ restriction is externally input to the external prediction parameter encoding unit 112 (see fig. 15). When the vector mvLX is a displacement vector, the AMVP prediction parameter derivation unit 1122a, the subtraction unit 1123, and the displacement vector limiter unit 1124a perform the following processing based on the value of the displacement vector limit information disparity _ restriction.
When the value of the displacement vector restriction information disparity _ restriction is zero or 2, the AMVP prediction parameter derivation unit 1122a outputs the calculated prediction vector dvpLX as it is to the subtraction unit 1123 and the displacement vector limiter unit 1124 a.
When the value of the displacement vector restriction information disparity _ restriction is 1 or 3, the AMVP prediction parameter derivation unit 1122a restricts the Y component dvp _ Y of the calculated prediction vector dvpLX to a value within a predetermined range. The AMVP prediction parameter derivation unit 1122a reconstructs the prediction vector dvpLX from the X component dvp _ X of the prediction vector dvpLX and the Y component dvp _ Y whose value range is limited, and outputs the reconstructed prediction vector dvpLX to the subtraction unit 1123 and the displacement vector limiter unit 1124 a.
When the value of the displacement vector restriction information disparity_restriction is zero or 1, the subtraction unit 1123 outputs the calculated difference vector dvdLX (a two-dimensional vector) as it is to the prediction parameter combining unit 1126 and the displacement vector limiter 1124a. When the value of the displacement vector restriction information disparity_restriction is 2 or 3, the subtraction unit 1123 omits the Y component dvd_Y of the calculated difference vector dvdLX and leaves the X component dvd_X, generating a one-dimensional difference vector (scalar value) dvdLX. The subtraction unit 1123 outputs the one-dimensional difference vector dvdLX, in which the X component dvd_X is left, to the prediction parameter combining unit 1126 and the displacement vector limiter 1124a.
The displacement vector limiter 1124a adds the prediction vector dvpLX input from the AMVP prediction parameter derivation unit 1122a and the difference vector dvdLX input from the subtraction unit 1123 to newly calculate the displacement vector dvLX. Here, when the difference vector dvdLX input from the subtraction unit 1123 is a one-dimensional difference vector, a value in a predetermined range (for example, zero) is added as its Y component dvd_Y. The displacement vector limiter 1124a outputs the newly calculated displacement vector dvLX, in association with the reference picture index refIdx and the vector index mvp_LX_idx, to the predicted image generation unit 101, and stores it in the prediction parameter memory 108.
Here, the reference picture index refIdx and the vector index mvp_LX_idx are input from the coding parameter determination unit 110 to the displacement vector limiter unit 1124a.
However, unlike the displacement vector limiter unit 1124, the vector mvLX (displacement vector dvLX) is not input from the coding parameter determination unit 110 to the displacement vector limiter unit 1124a. Therefore, when the value of the displacement vector restriction information disparity_restriction is any one of 1, 2, and 3, the range of the value of the Y component dv_Y of the output displacement vector dvLX is restricted.
The prediction parameter combining unit 1126 receives the displacement vector restriction information disparity_restriction as input. The prediction parameter combining unit 1126 combines it with the other input symbols and outputs the result to the entropy coding unit 104. Accordingly, the displacement vector restriction information disparity_restriction is also subject to entropy encoding.
In modification E2, the subtraction unit 1123 and the displacement vector limiter 1124a may omit the determination as to whether the vector mvLX is a displacement vector or a motion vector.
In modification E2, the displacement vector restriction information disparity_restriction may be included in each picture or in each sequence in the encoded stream Te.
Thus, in modification E2, whether or not the difference or the predicted value of the displacement (parallax) in the Y direction is used can be switched according to the arrangement of the viewpoints of the acquired images or the scene of the captured images. For the scene as a whole, both suppression of image-quality degradation and reduction of the information amount of the bit stream Te can be achieved.
(modification E3)
Next, an image encoding device 11c according to modification E3 will be described. Modification E3 is a modification corresponding to modification D3 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11c according to modification E3 includes an external prediction parameter encoding unit 112c instead of the external prediction parameter encoding unit 112 (see fig. 15).
Fig. 16 shows a schematic diagram illustrating the configuration of the external prediction parameter encoding unit 112c according to the modification E3.
The outer prediction parameter encoding unit 112c includes a displacement vector generator 1125 in place of the displacement vector limiter 1124 in the outer prediction parameter encoding unit 112.
In the present modification E3, the above-described slope coefficient inter_view_grad and slice coefficient inter_view_offset are input to the displacement vector generator 1125 from the outside.
When the vector mvLX is a displacement vector, the displacement vector generator 1125 receives the displacement vector dvLX, the reference picture index refIdx, and the vector index mvp_LX_idx as input from the merged prediction parameter deriving unit 1121 or the coding parameter determination unit 110. The displacement vector generator 1125 calculates the Y component dv_Y based on the X component dv_X of the displacement vector dvLX, the slope coefficient inter_view_grad, and the slice coefficient inter_view_offset. The process of calculating the Y component dv_Y may be the same as in the displacement vector setting unit 30382 (see modification D3 and fig. 7). As in the reference layer determination unit 30311 (see fig. 4), whether the vector mvLX is a displacement vector or a motion vector is determined based on the input reference image index refIdx. The displacement vector generator 1125 reconstructs the displacement vector dvLX from the X component dv_X and the calculated Y component dv_Y. The displacement vector generator 1125 may also calculate the Y component dv_Y based on the slice coefficient inter_view_offset alone, without using the X component dv_X of the displacement vector dvLX. That is, dv_Y may be set to inter_view_offset. In this case, since this means that the slope coefficient inter_view_grad is always 0, that coefficient need not be encoded.
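The derivation of the Y component from the X component can be illustrated as follows. The linear form below is a sketch consistent with the slope/offset description and with the special case dv_Y = inter_view_offset; the function name is illustrative, and the actual arithmetic precision of the embodiment (integer or fixed-point scaling) is not specified here.

```python
def derive_dv_y(dv_x, inter_view_grad, inter_view_offset):
    """Reproduce the vertical displacement from the horizontal one
    via an assumed linear relation (slope * dv_x + offset)."""
    return inter_view_grad * dv_x + inter_view_offset
```

When inter_view_grad is always 0, the result degenerates to inter_view_offset, matching the special case described in the text.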
The displacement vector generator 1125 outputs the reconstructed displacement vector dvLX to the predicted image generator 101 in association with the reference image index refIdx and the vector index mvp_LX_idx, and stores it in the prediction parameter memory 108.
When the vector mvLX is a motion vector, the displacement vector generator 1125 outputs the input motion vector mvLX to the predicted image generator 101 in association with the reference image index refIdx and the vector index mvp_LX_idx, and stores it in the prediction parameter memory 108.
The displacement vector generator 1125 outputs the slope coefficient inter_view_grad and the slice coefficient inter_view_offset to the prediction parameter combining unit 1126. The prediction parameter combining unit 1126 combines these coefficients input from the displacement vector generator 1125 with the other input symbols described above, and outputs the result to the entropy encoding unit 104. Accordingly, the slope coefficient inter_view_grad and the slice coefficient inter_view_offset are also subject to entropy encoding.
In the encoded stream Te, the slope coefficient inter_view_grad and the slice coefficient inter_view_offset may be included in each sequence or in each picture. In particular, by including these coefficients in each picture, it is possible to follow very fine picture-by-picture variations of a scene.
Thus, in modification E3, even when the viewpoints of the acquired images are not arranged, or their orientations not aligned, exactly parallel to one direction, so that the displacement (parallax) in the Y direction does not take a fixed value, the Y-direction displacement can be reproduced from the X-direction displacement (parallax). In modification E3, by switching the relationship between the X-direction displacement and the Y-direction displacement according to the scene of the captured images, it is possible to achieve both suppression of image-quality degradation and reduction of the information amount of the bit stream Te as a whole.
(modification E4)
Next, an image coding device 11d according to modification E4 will be described. Modification E4 is a modification corresponding to modification D4 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11d according to modification E4 includes a displacement vector limiter unit 1124d instead of the displacement vector limiter unit 1124 (see fig. 15).
The image coding device 11d may include a subtraction unit 1123a (see modification E1) instead of the subtraction unit 1123 (see fig. 15).
The displacement vector limiter unit 1124d performs the same processing as the displacement vector limiter unit 1124. The displacement vector limiter unit 1124d receives the vector mvLX, the reference picture index refIdx, and the vector index mvp_LX_idx from the merged prediction parameter derivation unit 1121 or the coding parameter determination unit 110. When the vector mvLX is a displacement vector, the displacement vector limiter unit 1124d limits the value of the input displacement vector dvLX to a value within a predetermined range, similarly to the vector limiter unit 30372. The displacement vector limiter unit 1124d outputs the value-limited displacement vector dvLX to the predicted image generator 101 in association with the reference image index refIdx and the vector index mvp_LX_idx, and stores it in the prediction parameter memory 108.
As in the reference layer determination unit 30311 (see modification D4 and fig. 9), the displacement vector limiter unit 1124d determines whether or not the vector mvLX is a displacement vector based on the reference image index refIdx. When the vector mvLX is a motion vector, the displacement vector limiter unit 1124d outputs the input motion vector mvLX to the predicted image generator 101 (see fig. 14) in association with the reference image index refIdx and the vector index mvp_LX_idx, and stores it in the prediction parameter memory 108.
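The component-wise limiting performed by the limiter units can be sketched as below. The concrete ranges are illustrative assumptions (the text only says "a predetermined range"); the narrower Y range reflects aspects (5) and (11), in which the vertical range is smaller than the horizontal one.

```python
def clip_displacement_vector(dv, x_range=(-128, 127), y_range=(-1, 1)):
    """Clamp each component of the displacement vector to its allowed range.
    The default ranges are assumed values, with the vertical range
    deliberately smaller than the horizontal range."""
    dv_x = max(x_range[0], min(x_range[1], dv[0]))
    dv_y = max(y_range[0], min(y_range[1], dv[1]))
    return (dv_x, dv_y)
```

Limiting the vector in this way bounds the reference area that the predicted image generator must be able to reach, which is what allows a small, fast internal memory to be used.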
In modification E4, an internal memory 1091 (not shown) may be provided between the reference image memory 109 and the predicted image generating unit 101. The reference images corresponding to the reference area are stored in the internal memory 1091 from the reference image memory 109, and the predicted image generating unit 101 reads the reference image block located at the position indicated by the displacement vector from the internal memory 1091. For example, by using a storage medium (flash memory) having a smaller storage capacity and a higher access speed than the reference image memory 109 as the internal memory 1091, the processing can be made faster and cheaper.
Thus, in modification E4, the range of values that the displacement vector output to the predicted image generating unit 101 can take is reduced. This range corresponds to the reference area, that is, the range referred to by the predicted image generation unit 101 when it reads a reference image from the reference image memory 109 for a given block to be encoded. That is, in modification E4, by reducing the reference area in the reference image memory 109, even a small-capacity storage medium can serve as the reference image memory 109, and the processing can be speeded up.
(modification E5)
Next, an image encoding device 11E according to modification E5 will be described. Modification E5 is a modification corresponding to modification D5 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11e according to modification E5 includes an AMVP prediction parameter derivation unit 1122e instead of the AMVP prediction parameter derivation unit 1122 (see fig. 15). The subtraction unit 1123a (see modification E1) may be included instead of the subtraction unit 1123, or may not be included. When the subtraction unit 1123a is not included, the displacement vector limiter unit 1124 (see fig. 15) may be omitted.
The AMVP prediction parameter derivation unit 1122e has the same configuration as the AMVP prediction parameter derivation unit 1122 (see fig. 15). Here, when the vector mvLX input from the coding parameter determination unit 110 is for outer layer prediction (a displacement vector), the AMVP prediction parameter derivation unit 1122e restricts the input displacement vector dvLX to a value within a predetermined range, as in the displacement vector limiter unit 30338 (see fig. 11). Similarly to the reference layer determination unit 30311 (see modification D1 and fig. 4), whether the vector mvLX is a displacement vector or a motion vector is determined based on the reference image index refIdx.
Thus, the AMVP prediction parameter derivation unit 1122e restricts the reference area from which the prediction parameters are read out of the prediction parameter memory 108. This is because the reference area corresponds to the range of values the displacement vector dvLX can take, with the encoding target block as its starting point.
That is, in modification E5, by reducing the reference area in the prediction parameter memory 108, even a small-capacity storage medium can be used as the prediction parameter memory 108, and the processing can be speeded up. In modification E5, an internal memory 1081 (not shown) may be provided, and the prediction parameters in the reference area may be stored in it from the prediction parameter memory 108. The AMVP prediction parameter derivation unit 1122e then reads the prediction parameters via the internal memory 1081. The internal memory 1081 may be a storage medium having a smaller storage capacity and a faster access speed than the prediction parameter memory 108.
The image encoding device 11e according to modification E5 includes a merged prediction parameter derivation unit 1121e instead of the merged prediction parameter derivation unit 1121 (see fig. 15).
The merged prediction parameter derivation unit 1121e has the same configuration as the merged prediction parameter derivation unit 1121 (see fig. 15). Here, when reading the prediction parameters of a layer image (for example, the base view) of a layer different from the target layer image using a displacement vector, the merged prediction parameter deriving unit 1121e limits the displacement vector dvLX to a value within a predetermined range, as in the displacement vector limiter unit 30338 (see fig. 13).
Thus, the merged prediction parameter deriving unit 1121e restricts the reference region from which the prediction parameters of layer images other than the target layer image are read out of the prediction parameter memory 108, thereby speeding up the processing. In modification E5, an internal memory 1082 (not shown) may be provided, and the prediction parameters in the reference area may be stored in it from the prediction parameter memory 108. The merged prediction parameter deriving unit 1121e then reads the prediction parameters via the internal memory 1082. The internal memory 1082 may be a storage medium having a smaller storage capacity and a faster access speed than the prediction parameter memory 108.
In the above-described embodiment, the image transmission system 1 (see fig. 1) may include one of the above-described image encoding devices 11a to 11e and the corresponding image decoding devices 31a to 31e instead of the image encoding device 11 and the image decoding device 31.
In the above-described embodiment, the image transmission system 1 may omit the image display device 41. In the above-described embodiment, an image display system including one of the image decoding apparatuses 31 and 31a to 31e and the image display apparatus 41 may be configured.
In this way, in the above-described embodiment, a displacement vector indicating a part of the displacement and another part of the displacement is generated based on a symbol indicating the part of the displacement between a first layer image and a second layer image different from the first layer image. Further, a reference image of the region indicated by the generated displacement vector is read from a reference image storage unit that stores reference images, and a predicted image is generated based on the read reference image.
Therefore, it is not necessary to encode the other part of the disparity, for example, the Y component. In addition, when one part of the disparity, for example, the X component, is dominant, there is no deterioration in the accuracy of disparity prediction even if the other part of the disparity is not encoded. Therefore, deterioration in the quality of the generated image is suppressed and the encoding efficiency is improved.
Further, in the above-described embodiment, a displacement vector indicating a displacement between a first layer image and a second layer image different from the first layer image is generated based on a sign indicating the displacement, and the displacement vector is limited to a value within a predetermined range. Further, a reference image of a region indicated by the generated displacement vector is read from a reference image storage unit that stores the reference image, and a predicted image is generated based on the read reference image.
Therefore, deterioration in the quality of the generated image is suppressed, and since the region from which the reference image is read is limited, the processing when referring to the reference image can be speeded up. In addition, since the amount of code is reduced, the coding efficiency is improved.
(modification E6)
Next, an image encoding device 11f (not shown) according to modification E6 will be described. Modification E6 is a modification corresponding to modification D6 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11f according to modification E6 includes an entropy encoding unit 104f instead of the entropy encoding unit 104.
Fig. 34 is a block diagram showing the configuration of the entropy encoding unit 104f according to modification E6.
The entropy encoding unit 104f includes an arithmetic encoding unit 1041, a reference layer determination unit 1042f, and a vector difference syntax encoding unit 1043 f. The reference layer determination unit 1042f and the vector difference syntax encoding unit 1043f may have the same configuration as the reference layer determination unit 30371 and the vector difference syntax decoding unit 3013f (see fig. 25).
The arithmetic coding unit 1041 includes a context record updating unit 10411 and a bit coding unit 10412. The configuration of the context record update unit 10411 may be the same as that of the context record update unit 30111 (see fig. 25).
The bit encoding unit 10412 refers to the context variable CV recorded in the context record updating unit 10411, and encodes each Bin constituting the difference vector mvdLX supplied from the external prediction parameter encoding unit 112. The encoded value of Bin is also supplied to the context record update unit 10411, and is referred to for updating the context variable CV.
The vector difference syntax coding unit 1043f derives, from the difference vector mvdLX supplied from the external prediction parameter coding unit 112, a context index ctxIdx for coding each Bin of the syntax elements abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], and abs_mvd_minus2[XY] constituting the difference vector mvdLX, and records the derived context index ctxIdx in the context record updating unit 10411. Then, using the stored context index ctxIdx, the bit encoding unit 10412 encodes each syntax element. The vector difference syntax encoding unit 1043f may use the derivation table information shown in any one of figs. 27 to 31 when deriving the context index ctxIdx.
In the above-described embodiment, by deriving the context of the syntax element abs_mvd_greater0_flag[XY], which indicates whether the absolute value of the difference vector exceeds 0, based on the reference picture index refIdx and on whether the target component is the X component or the Y component, the amount of information of the syntax elements to be encoded is reduced, and the encoding efficiency is improved.
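The component- and layer-dependent context selection described above can be sketched as follows. The concrete index values are an illustrative assumption of one plausible assignment; the actual derivation tables of the embodiment are given in figs. 27 to 31, which are not reproduced here.

```python
def ctx_for_abs_mvd_greater0(is_inter_layer_pred, is_y_component):
    """Return a hypothetical context index increment for
    abs_mvd_greater0_flag, split by prediction type and component."""
    if not is_inter_layer_pred:
        return 0                          # motion vectors: shared context
    return 2 if is_y_component else 1     # displacement: per-component contexts
```

Separating the contexts this way lets the arithmetic coder learn that, for displacement vectors, the Y difference is far more often zero than the X difference, which is what reduces the coded information.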
(modification E7)
Next, the image encoding device 11g (not shown) according to modification E7 will be described. Modification E7 is a modification corresponding to modification D7 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11g according to modification E7 includes an entropy encoding unit 104g instead of the entropy encoding unit 104.
Fig. 35 is a block diagram showing the configuration of the entropy encoding unit 104g according to modification E7.
The entropy encoding unit 104g includes an arithmetic encoding unit 1041, a vector difference syntax encoding unit 1043g, and a target layer determination unit 1044 g.
The arithmetic coding unit 1041, the vector difference syntax coding unit 1043g, and the target layer determination unit 1044g may have the same configurations as those of the arithmetic coding unit 1041 (see fig. 34), the vector difference syntax decoding unit 3013g (see fig. 32), and the target layer determination unit 30171 (see fig. 32), respectively.
Here, the vector difference syntax coding unit 1043g derives a context index ctxIdxInc for coding each Bin of abs_mvd_greater0_flag[XY], abs_mvd_greater1_flag[XY], and abs_mvd_minus2[XY], which are the syntax elements constituting the difference vector mvdLX supplied from the external prediction parameter coding unit 112. The vector difference syntax encoding unit 1043g stores the derived context index ctxIdxInc in the context record update unit 10411. The bit encoding unit 10412 encodes each syntax element constituting the difference vector mvdLX using the context index ctxIdxInc stored in the context record update unit 10411. The bit encoding unit 10412 outputs each encoded syntax element as a part of the bit stream Te.
Thus, in the present modification, by deriving the context of the flag abs_mvd_greater0_flag[XY], which indicates whether the absolute value of the difference vector exceeds 0, based on the reference picture index refIdx and on whether the target component is the X component or the Y component, the amount of information of the syntax elements to be encoded is reduced, and the encoding efficiency is improved.
(modification E8)
Next, the image encoding device 11h (not shown) according to modification E8 will be described. Modification E8 is a modification corresponding to modification D8 described above. The same reference numerals are given to the same structures, and the above description is applied.
The image encoding device 11h according to the modification E8 includes a merging prediction parameter derivation unit 1121h and an AMVP prediction parameter derivation unit 1122h, instead of the merging prediction parameter derivation unit 1121 and the AMVP prediction parameter derivation unit 1122 (see fig. 15), respectively.
Fig. 43 is a block diagram showing the configuration of the AMVP prediction parameter derivation unit 1122h according to modification E8.
The AMVP prediction parameter derivation unit 1122h includes a vector candidate derivation unit 1127, a prediction vector selection unit 1128, and a prediction parameter reference unit 1129. The vector candidate derivation unit 1127, the prediction vector selection unit 1128, and the prediction parameter reference unit 1129 may have the same configurations as the vector candidate derivation unit 3033, the prediction vector selection unit 3034, and the prediction parameter reference unit 3039, respectively (see fig. 36). The prediction parameter reference unit 1129 refers to the prediction parameters from the prediction parameter memory 108, in the same manner as the prediction parameter reference unit 3039 refers to the prediction parameters from the prediction parameter memory 307 (see fig. 36).
Fig. 44 is a block diagram showing the configuration of the merged prediction parameter deriving unit 1121h according to the modification example E8.
The merging prediction parameter derivation unit 1121h includes a merging candidate derivation unit 11211, a merging candidate selection unit 11212, and a prediction parameter reference unit 11213. The merge candidate derivation unit 11211, the merge candidate selection unit 11212, and the prediction parameter reference unit 11213 may have the same configurations as the merge candidate derivation unit 30364, the merge candidate selection unit 30365, and the prediction parameter reference unit 3039, respectively (see fig. 39). The prediction parameter reference unit 11213 refers to the prediction parameters from the prediction parameter memory 108, in the same manner as the prediction parameter reference unit 3039 refers to the prediction parameters from the prediction parameter memory 307 (see fig. 39).
In the present modification, in the outer layer prediction by the prediction parameter reference units 11213 and 1129, the prediction parameters are referred to from the prediction parameter memory in units of blocks of a predetermined size. The prediction parameter memory is thus not referred to in small units such as 4 × 4 (or 4 × 8 and 8 × 4), so the amount of memory required for the prediction parameter memory is reduced.
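The coarse-grained parameter reference can be sketched as a simple address mapping. The 8 × 8 granularity and the function name are assumptions for illustration; the embodiment only states that a block of "a predetermined size" larger than the 4 × 4 (or 4 × 8 / 8 × 4) units is used.

```python
BLOCK_LOG2 = 3   # assumed 8x8 granularity instead of per-4x4 units

def param_store_index(x, y):
    """Map a pixel position to the coarse grid cell whose prediction
    parameters are stored; all positions inside one block share a cell."""
    return (x >> BLOCK_LOG2, y >> BLOCK_LOG2)
```

Because every position inside a block maps to the same cell, only one set of prediction parameters per block needs to be kept, which is the source of the memory reduction.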
The present invention can also be expressed as follows.
(1) An aspect of the present invention is an image decoding apparatus including: a displacement vector generation unit that generates a displacement vector indicating a part of the displacement and another part of the displacement based on a symbol indicating a part of the displacement between a first layer image and a second layer image different from the first layer image; a reference image storage unit for storing a reference image; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit from the reference image storage unit, and generates a predicted image based on the read reference image.
(2) Another aspect of the present invention may be the image decoding apparatus described above, wherein the part of the displacement is a horizontal component of the displacement, and the other part of the displacement is a vertical component of the displacement. The displacement vector generation unit may be configured to set the vertical component, the prediction value of the vertical component, or the prediction residual of the vertical component to a predetermined value.
(3) Another aspect of the present invention may be the image decoding apparatus described above, wherein the part of the displacement is a horizontal component of the displacement, and the other part of the displacement is a vertical component of the displacement. The displacement vector generation unit may be configured to calculate the vertical component of the displacement based on a symbol indicating a relationship between the vertical component and the horizontal component.
(4) Another aspect of the present invention is an image decoding apparatus including: a displacement vector generation unit that generates a displacement vector indicating a displacement between a first layer image and a second layer image different from the first layer image, based on a symbol indicating the displacement; a displacement vector limiting unit that limits the displacement vector to a value within a predetermined range; a reference image storage unit for storing a reference image; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit from the reference image storage unit and generates a predicted image based on the read reference image.
(5) In another aspect of the present invention, in the above-described image decoding device, the displacement vector limiting unit limits the range of values of the displacement vector such that the range of vertical components is smaller than the range of horizontal components.
(6) Another aspect of the present invention may be the image decoding apparatus described above, including: a prediction parameter storage unit that stores the derived prediction parameters for each region of the image; and a prediction parameter deriving unit configured to refer to the prediction parameter relating to the area indicated by the displacement vector restricted by the displacement vector restricting unit among the prediction parameters stored in the prediction parameter storage unit, and to derive a prediction vector that is a prediction value of the motion vector or the displacement vector as at least a part of the prediction parameter.
(7) Another aspect of the present invention is an image encoding device including: a displacement vector generation unit that generates a displacement vector indicating a part of the displacement and another part of the displacement based on a part of the displacement between a first layer image and a second layer image different from the first layer image; a reference image storage unit for storing a reference image; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit from the reference image storage unit, and generates a predicted image based on the read reference image.
(8) Another aspect of the present invention may be the image encoding device described above, wherein a part of the displacement is a horizontal component of the displacement, and another part of the displacement is a vertical component of the displacement. The displacement vector generator may be configured to determine the vertical component, the prediction value of the vertical component, or the prediction residual of the vertical component to a predetermined value.
(9) Another aspect of the present invention may be the image encoding device described above, wherein the part of the displacement is a horizontal component of the displacement, and the other part of the displacement is a vertical component of the displacement. The displacement vector generation unit may calculate the vertical component of the displacement based on a relationship between the vertical component and the horizontal component, and the device may further include an encoder that encodes the relationship between the vertical component and the horizontal component.
(10) Another aspect of the present invention is an image encoding device including: a displacement vector generation unit that generates a displacement vector indicating a displacement between a first layer image and a second layer image different from the first layer image, based on a symbol indicating the displacement; a displacement vector limiting unit that limits the displacement vector to a value within a predetermined range; a reference image storage unit for storing a reference image; and a predicted image generation unit that reads a reference image of the region indicated by the displacement vector generated by the displacement vector generation unit from the reference image storage unit and generates a predicted image based on the read reference image.
(11) In another aspect of the present invention, in the above-described image encoding device, the displacement vector restricting unit may restrict a range of values of the displacement vector such that a range of vertical components is smaller than a range of horizontal components.
(12) Another aspect of the present invention may be the image encoding device described above, including: a prediction parameter storage unit that stores the derived prediction parameters for each region of the image; and a prediction parameter deriving unit configured to refer to the prediction parameter relating to the area indicated by the displacement vector restricted by the displacement vector restricting unit among the prediction parameters stored in the prediction parameter storage unit, and to derive a prediction vector that is a prediction value of the displacement vector as at least a part of the prediction parameter.
(13) Another aspect of the present invention is an image decoding apparatus including: a vector difference decoding unit that derives a context for arithmetic decoding and decodes a vector difference from encoded data; a vector derivation unit that derives a vector of the target block from a sum of the vector of a processed block and the vector difference; a reference image storage unit for storing a reference image; a predicted image generation unit that reads a reference image of the region indicated by the vector of the target block derived by the vector derivation unit from the reference image storage unit and generates a predicted image based on the read reference image; and a reference layer determination unit that determines whether the vector or the vector difference of the target block relates to prediction between different layers, wherein the vector difference decoding unit assigns a context based on whether or not the reference layer determination unit determines that the vector or the vector difference relates to prediction between different layers.
(14) In another aspect of the present invention, in the above-described image decoding device, the vector difference decoding unit may be configured to, when the reference layer determination unit determines that the prediction is between different layers, assign different contexts to a syntax element constituting a vertical component of the vector difference and a syntax element constituting a horizontal component of the vector difference.
(15) In another aspect of the present invention, in the above-described image decoding apparatus, the vector difference decoding unit may be configured to assign different contexts to at least the syntax elements constituting one component of the vector difference, depending on whether or not the reference layer determination unit determines that the prediction is between different layers.
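Aspects (13) to (15) can be summarized as a context-selection rule for the arithmetic decoder: the context index depends on whether the prediction crosses layers and, if so, on which component of the vector difference is being decoded. The context indices below are placeholders, not the actual context table of the embodiments.

```python
def select_context(is_inter_layer_prediction, component):
    # component: 0 for the horizontal (x) syntax element,
    #            1 for the vertical (y) syntax element.
    if is_inter_layer_prediction:
        # Separate contexts per component (aspect (14)): disparity
        # vectors are strongly biased toward the horizontal axis, so
        # the vertical difference is very often zero.
        return 1 + component
    # Same-layer motion vectors share one context for both components.
    return 0
```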
(16) Another aspect of the present invention may be the above-described image decoding apparatus, wherein the syntax element is information indicating whether or not an absolute value of the vector difference exceeds 0.
(17) In another aspect of the present invention, in the above-described image decoding apparatus, the reference layer determination unit may be configured to determine whether the prediction is between different layers based on whether the time, the view identifier, or the use of a long-term reference picture differs between the target picture to which the target block belongs and the reference picture referred to by the target block.
(18) In another aspect of the present invention, in the above-described image decoding apparatus, the reference layer determination unit may be configured to determine that the prediction is between different layers based on whether or not the reference image is an image related to the base layer or an image related to the base view.
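One way to read the determination in aspects (17) and (18): prediction is treated as crossing layers when the target picture and the referenced picture share the same display time but differ in view identifier, or when a long-term reference picture is involved. This is a hedged sketch; the parameter names and exact conditions are assumptions rather than the embodiments' definition.

```python
def is_inter_layer_prediction(target_poc, ref_poc,
                              target_view_id, ref_view_id,
                              ref_is_long_term=False):
    # Same display time (POC) but a different view identifier suggests
    # disparity (inter-view) prediction; a long-term reference picture
    # is also treated as an inter-layer case here (aspect (17)).
    same_time = target_poc == ref_poc
    different_view = target_view_id != ref_view_id
    return (same_time and different_view) or ref_is_long_term
```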
(19) Another aspect of the present invention is an image encoding device including: a vector difference encoding unit that derives a context of the arithmetic symbol and encodes the vector difference; a vector difference derivation unit that derives the vector difference from the vector of the processed block and the vector of the target block; a reference image storage unit for storing a reference image; a predicted image generation unit that reads a reference image of a region indicated by a vector of the target block from the reference image storage unit and generates a predicted image based on the read reference image; and a reference layer determination unit configured to determine whether the vector or the vector difference is predicted between different layers, wherein the vector difference encoding unit assigns a context based on whether the reference layer determination unit determines that the vector or the vector difference is predicted between different layers.
(20) Another aspect of the present invention is an image decoding apparatus including: a prediction parameter storage unit that stores the derived prediction parameters for each predetermined region of the image; a prediction parameter derivation unit that derives a prediction parameter or a prediction vector of a target prediction block; a reference image storage unit for storing a reference image; and a prediction image generation unit configured to read a reference image of a region indicated by the vector derived by the prediction parameter derivation unit from the reference image storage unit and generate a prediction image based on the read reference image, wherein the prediction parameter derivation unit includes a prediction parameter reference unit configured to refer to the prediction parameter stored in the prediction parameter storage unit, and the prediction parameter reference unit includes an external layer reference address conversion unit configured to convert coordinates of the prediction parameter of the reference block when a target image to which the target prediction block belongs and a reference image to which a reference block that is a part of the reference image belongs belong to different layers.
(21) In another aspect of the present invention, in the above-described image decoding device, the external-layer reference address conversion unit may be configured such that the operation of converting the coordinates includes an operation of discretizing the coordinates in large units.
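The coordinate conversion in aspects (20) and (21) can be sketched as quantizing the reference coordinates to a coarse grid, so that prediction parameters of another layer are stored and fetched at a granularity larger than the minimum block size. The parameterization below is an assumption; the claims use a grid of 8 samples for inter-layer reference and 16 samples for temporal prediction.

```python
def outer_layer_reference_address(xP, yP, log2_unit=3):
    # Discretize the coordinates in units of (1 << log2_unit) samples,
    # i.e. (xP >> log2_unit) << log2_unit, keeping only the top-left
    # corner of the enclosing grid cell (aspect (21)).
    return ((xP >> log2_unit) << log2_unit,
            (yP >> log2_unit) << log2_unit)
```

With log2_unit=3 this realizes the (xP >> 3) << 3 operation of the claims; temporal prediction would use log2_unit=4.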
(22) Another aspect of the present invention is an image encoding device including: a prediction parameter storage unit that stores the derived prediction parameters for each predetermined region of the image; a prediction parameter derivation unit that derives a prediction parameter or a prediction vector of a target prediction block; a reference image storage unit for storing a reference image; and a prediction image generation unit configured to read a reference image of a region indicated by the vector derived by the prediction parameter derivation unit from the reference image storage unit and generate a prediction image based on the read reference image, wherein the prediction parameter derivation unit includes a prediction parameter reference unit configured to refer to the prediction parameter stored in the prediction parameter storage unit, and the prediction parameter reference unit includes an external layer reference address conversion unit configured to convert coordinates of the prediction parameter of the reference block when a target image to which the target prediction block belongs and a reference image to which a reference block that is a part of the reference image belongs belong to different layers.
Further, a part of the image encoding devices 11 to 11h and the image decoding devices 31 to 31h in the above-described embodiments, for example, the predicted image generating unit 101, the DCT/quantizing unit 103, the entropy encoding unit 104, the inverse quantization/inverse DCT unit 105, the encoding parameter determining unit 110, the prediction parameter encoding unit 111, the entropy decoding unit 301, the prediction parameter decoding unit 302, the predicted image generating unit 308, and the inverse quantization/inverse DCT unit 311, may be implemented by a computer. In this case, a program for realizing these control functions may be recorded in a computer-readable recording medium, and the program recorded in the recording medium may be read and executed by a computer system. The "computer system" referred to here is a computer system incorporated in one of the image encoding devices 11 to 11h or the image decoding devices 31 to 31h, and includes hardware such as an OS and peripheral devices. The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built in the computer system. The "computer-readable recording medium" may also include a medium that dynamically holds the program for a short period of time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory in a computer system serving as a server or a client in that case. The program may realize a part of the above-described functions, or may realize the above-described functions in combination with a program already recorded in the computer system.
In addition, a part or all of the image encoding devices 11 to 11e and the image decoding devices 31 to 31e in the above-described embodiments may be implemented as an integrated circuit such as an LSI (Large Scale Integration). The functional blocks of the image encoding devices 11 to 11e and the image decoding devices 31 to 31e may be individually implemented as processors, or a part or all of them may be integrated into a single processor. The method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Furthermore, if a circuit integration technology that replaces LSI emerges with advances in semiconductor technology, an integrated circuit based on that technology may be used.
While one embodiment of the present invention has been described in detail with reference to the drawings, the specific configuration is not limited to the above-described embodiment, and various design changes and the like can be made without departing from the scope of the present invention.
Industrial applicability
The present invention is applicable to an apparatus for encoding images from a plurality of viewpoints, an apparatus for decoding encoded images, and the like.
Description of the reference numerals
1 image transmission system
11. 11a-11h image coding device
101 predicted image generating part
102 subtraction part
103 DCT/quantization section
104. 104f, 104g entropy coding unit
1041 arithmetic coding unit
10411 context record updating unit
10412 bit encoding unit
1042f reference layer determining part
1043f, 1043g vector difference syntax coding unit
1044g target layer determination unit
105 inverse quantization/inverse DCT unit
106 addition unit
108 prediction parameter memory
109 reference picture memory
110 encoding parameter determining section
111 prediction parameter encoding unit
112. 112c external prediction parameter encoding unit
1121. 1121e, 1122h merge prediction parameter derivation unit
11213 prediction parameter reference unit
11211 merging candidate derivation unit
11212 merging candidate selection unit
1122. 1122h AMVP prediction parameter derivation unit
1123. 1123a subtraction part
1124. 1124a, 1124d displacement vector limiter
1125 displacement vector generator
1126 prediction parameter merging unit
1127 vector candidate derivation unit
1128 prediction vector selecting unit
1129 prediction parameter reference unit
113 intra prediction parameter encoding unit
21 network
31. 31a-31h image decoding device
301. 301f, 301g entropy decoding unit
3011 arithmetic symbol decoding unit
30111 context record update unit
30112 bit decoding unit
3012 reference index decoding unit
3013. 3013f, 3013g vector difference syntax decoding unit
3014 layer ID decoding unit
30171 target layer determination unit
302 prediction parameter decoding unit
303. 303c, 303d external prediction parameter decoding unit
3031 external prediction parameter extraction unit
30311 reference layer determination unit
30312 vector differential decoding unit
3032. 3032a, 3032h AMVP prediction parameter derivation part
30321a disparity prediction vector limiter
3033 vector candidate derivation unit
30335. 30335e extended vector candidate derivation section
30336 displacement vector acquisition unit
30337 outer layer vector candidate derivation section
30338 displacement vector limiter
30339 vector candidate storage unit
3034 prediction vector selecting unit
3035 addition unit
3036. 3036e, 3036h merge prediction parameter derivation unit
30360 extended merge candidate derivation section
30361 displacement vector acquisition unit
30363 external layer merging candidate derivation unit
30364 merging candidate derivation unit
30365 merging candidate selection unit
3037 displacement vector limiter
30371 reference layer determination unit
30372 vector limiter unit
3038 displacement vector generation unit
30381 reference layer determination unit
30382 displacement vector setting unit
3039 prediction parameter reference unit
30391 spatial prediction reference address conversion unit
30392 temporal prediction reference address conversion unit
30393 external layer reference address conversion unit
304 intra prediction parameter decoding unit
306 reference picture memory
3061 internal memory
307 prediction parameter memory
3071 target CTB prediction parameter memory
3072 left CTB column prediction parameter memory
3073 CTB line prediction parameter memory
3074 temporal prediction parameter memory
3075 external layer prediction parameter memory
3076 internal memory
308 predicted image generating part
309 external prediction image generating section
3091 displacement prediction image generating unit
310 intra prediction image generation unit
311 inverse quantization/inverse DCT unit
312 addition unit
41 image display device

Claims (3)

1. An image decoding apparatus comprising:
a prediction parameter derivation unit configured to derive a prediction parameter of the target block; and
a predicted image generation unit that generates a predicted image based on the derived prediction parameter,
the prediction parameter derivation unit uses the calculations (xP >> 3) << 3 and (yP >> 3) << 3 for the coordinates (xP, yP) of a reference block that is a part of the reference image when the target image and the reference image belong to different layers,
in the case of temporal prediction, the prediction parameter derivation unit uses the calculations (xP >> 4) << 4 and (yP >> 4) << 4 for the coordinates (xP, yP) of the reference block.
2. An image encoding device comprising:
a prediction parameter derivation unit that derives a prediction parameter of the target block; and
a predicted image generation unit that generates a predicted image based on the prediction parameter,
the prediction parameter derivation unit uses the calculations (xP >> 3) << 3 and (yP >> 3) << 3 for the coordinates (xP, yP) of a reference block that is a part of the reference image when the target image and the reference image belong to different layers,
in the case of temporal prediction, the prediction parameter derivation unit uses the calculations (xP >> 4) << 4 and (yP >> 4) << 4 for the coordinates (xP, yP) of the reference block.
3. An image decoding method comprising at least the steps of:
deriving a prediction parameter of the target block; and
a step of generating a prediction image based on the derived prediction parameter,
the step of deriving the prediction parameter uses the calculations (xP >> 3) << 3 and (yP >> 3) << 3 for the coordinates (xP, yP) of a reference block that is a part of the reference image when the target image and the reference image belong to different layers,
in the case of temporal prediction, the step of deriving the prediction parameter uses the calculations (xP >> 4) << 4 and (yP >> 4) << 4 for the coordinates (xP, yP) of the reference block.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2012217904 2012-09-28
JP2012-217904 2012-09-28
PCT/JP2013/076016 WO2014050948A1 (en) 2012-09-28 2013-09-26 Image decoding device and image encoding device

Publications (2)

Publication Number Publication Date
HK1208294A1 HK1208294A1 (en) 2016-02-26
HK1208294B true HK1208294B (en) 2018-11-30
