US20120300843A1 - Block-based interleaving - Google Patents
- Publication number: US20120300843A1 (application US 13/575,803)
- Authority: US (United States)
- Legal status: Granted
Classifications
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
- H04N13/172—Processing image signals comprising non-image signal components, e.g. headers or format information
- H04N13/194—Transmission of image signals
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/176—Adaptive coding in which the coding unit is an image region, the region being a block, e.g. a macroblock
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/593—Predictive coding involving spatial prediction techniques
- H04N19/597—Predictive coding specially adapted for multi-view video sequence encoding
- H04N19/61—Transform coding in combination with predictive coding
Summary
- According to a further general aspect, a video signal or a video signal structure includes one or more picture portions for an encoding. The encoding is an encoding of a block-based interleaving of multiple blocks of a first image and multiple blocks of a second image. The multiple blocks of the first image include a first-image block, and the multiple blocks of the second image include a second-image block that has overlapping content with the first-image block. The encoding of the first-image block uses the second-image block as a reference.
- According to another general aspect, an encoded image is accessed. The encoded image is an encoding of a block-based interleaving of multiple blocks of a first image and multiple blocks of a second image. The multiple blocks of the first image include a first-image block, and the multiple blocks of the second image include a second-image block that has overlapping content with the first-image block. A portion of the encoded image is decoded; that portion encodes the first-image block using the second-image block as a reference. The decoded portion is provided for processing or display.
- Implementations may be configured or embodied in various manners. For example, an implementation may be performed as a method, embodied as an apparatus (such as an apparatus configured to perform a set of operations, or an apparatus storing instructions for performing a set of operations), or embodied in a signal.
- FIG. 1 is a block/flow diagram depicting an example of a system and process for encoding and decoding images that may be used with one or more implementations.
- FIG. 2 is a block diagram depicting examples of neighboring blocks that may be used with one or more implementations.
- FIG. 3 is a block diagram depicting examples of neighboring reference blocks that may be used with one or more implementations.
- FIG. 4 is a block/flow diagram depicting examples of vertical interleaving and horizontal interleaving that may be used with one or more implementations.
- FIG. 5 is a flow diagram depicting an example of an encoding process that may be used with one or more implementations.
- FIG. 6 is a flow diagram depicting an example of a decoding process that may be used with one or more implementations.
- FIG. 7 is a block/flow diagram depicting an example of an encoding system that may be used with one or more implementations.
- FIG. 8 is a block/flow diagram depicting an example of a decoding system that may be used with one or more implementations.
- FIG. 9 is a block/flow diagram depicting an example of a video transmission system that may be used with one or more implementations.
- FIG. 10 is a block/flow diagram depicting an example of a video receiving system that may be used with one or more implementations.
- At least one implementation described in this application seeks to improve the efficiency of compressing a stereo image pair that has been merged into a single image.
- the implementation rearranges the stereo image pair in a way that allows the H.264 compression algorithm to take better advantage of intra block prediction.
- the left view and right view pictures of the stereo image pair are interleaved at the macroblock level.
- the left view and right view pictures are encoded together as a single picture, and the interleaved picture arrangement typically improves intra prediction efficiency versus typical horizontal or vertical split screen arrangements.
- In block-based compression algorithms (for example, MPEG-2 and MPEG-4), the inventors have determined that a disproportionate percentage of the total bit budget allocated to a compressed stream is spent on I-picture compression. Note that I pictures are often used as reference pictures. In the near term, broadcast 3D video is likely to rely on a split-screen approach to deliver a left/right stereo image pair. A typical arrangement is a left and a right picture, each horizontally sub-sampled by half and concatenated to form a single full-size composite left+right picture.
- Horizontal sub-sampling and vertical sub-sampling are both used in current-generation half-resolution 3D encoders. Typically, horizontal sub-sampling is used for 1920×1080 source material, and vertical sub-sampling is used for 1280×720 source material.
- Multi-view video coding (MVC), also referred to as the “MVC extension” or simply “MVC”, is a non-backward-compatible compression algorithm. It is an extension of the H.264/MPEG-4 AVC standard that has been developed to take advantage of, for example, the redundancy between the left and right views in a stereo image pair.
- Referring to FIG. 1, the system 100 includes an encoding block 110, a decoding block 120, and a transmission operation 130 that links the encoding block 110 and the decoding block 120.
- Full resolution input pictures for a stereo-image pair are provided as input to the encoding block 110 .
- the full resolution stereo images include a left view picture 140 and a right view picture 142 .
- the full resolution images are down-sampled in the horizontal dimension by ½, which results in a horizontal sample rate conversion (“SRC”) to ½ the original horizontal size. Down-sampling is also referred to as sub-sampling, rate converting, or down-scaling.
- the encoding block 110 includes a sampler 144 that down-samples the left view picture 140 , and a sampler 146 that down-samples the right view picture 142 .
- the sampler 144 produces a sampled left view picture 148 that is ½ the size of the left view picture 140 in the horizontal dimension.
- the sampler 146 produces a sampled right view picture 150 that is ½ the size of the right view picture 142 in the horizontal dimension.
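- The horizontal SRC performed by the samplers 144 and 146 can be sketched as follows; the patent does not specify the decimation filter, so simple pair-averaging is assumed here (a real system would use an anti-aliased filter):

```python
import numpy as np

def downsample_horizontal_2x(pic: np.ndarray) -> np.ndarray:
    """Halve the horizontal size (SRC to 1/2 the original width).

    Pair-averaging is a crude stand-in for a proper anti-aliased
    decimation filter; an even width is assumed.
    """
    p = pic.astype(np.float32)
    return ((p[:, 0::2] + p[:, 1::2]) / 2.0).astype(pic.dtype)
```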
- the sampled left view picture 148 and the sampled right view picture 150 are interleaved to form an interleaved composite picture 152 .
- the composite picture 152 is formed by decomposing (also referred to as partitioning or dividing) the sampled left view picture 148 into 16×16 macroblocks, decomposing the sampled right view picture 150 into 16×16 macroblocks, and interleaving the macroblocks from the two sampled pictures to form the composite picture 152.
- the macroblocks are interleaved on an alternating basis in a column-by-column format, as explained further with respect to FIG. 4 below.
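- A minimal sketch of this macroblock interleaving, assuming equal-sized, macroblock-aligned images held as NumPy arrays (the function name and layout are illustrative, not taken from the patent):

```python
import numpy as np

MB = 16  # macroblock size used in the implementation described above

def interleave_blocks(left: np.ndarray, right: np.ndarray, axis: int = 1) -> np.ndarray:
    """Interleave two equal-sized pictures macroblock-wise.

    axis=1 alternates 16-pixel-wide block columns (the column-by-column
    format above); axis=0 alternates 16-pixel-tall block rows (row-wise
    interleaving). The result is twice the input size along `axis`.
    """
    assert left.shape == right.shape and left.shape[axis] % MB == 0
    n = left.shape[axis] // MB
    chunks = []
    for i in range(n):
        sl = [slice(None)] * left.ndim
        sl[axis] = slice(i * MB, (i + 1) * MB)
        chunks.append(left[tuple(sl)])   # block column (or row) i of the left view
        chunks.append(right[tuple(sl)])  # followed by the same slice of the right view
    return np.concatenate(chunks, axis=axis)
```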
- the encoding block 110 includes an H.264 encoder 154 that encodes the composite picture 152 .
- the composite picture 152 is encoded using HP@L4.x to form an encoded picture (not shown).
- HP@L4.x refers to High Profile, level 4.x, which includes 4.0, 4.1, and 4.2.
- other implementations use any of several of the H.264 coding profiles, such as, for example, all levels of the Baseline Profile, all levels of the Main Profile, and all levels of the High Profile.
- the encoder 154 encodes the composite picture 152 as an I picture, and uses H.264 intra-coding modes. Accordingly, the blocks of the composite picture 152 are encoded using one or more other blocks from the composite picture 152 as a reference.
- a predictor for a given block may be formed from a combination of neighboring blocks.
- a neighboring block of a given block is commonly defined to be one of the eight blocks that touch the given block on a corner or edge. Referring to FIG. 2, the eight neighboring blocks of a middle block M are shown as blocks 1-8. Note that for purposes of H.264 intra prediction modes, blocks 1, 2, 3, 4, and 6 are generally allowed as predictors.
- referring to FIG. 3, various implementations form the predictor for a given block (block G) from a combination of blocks lying horizontally to the left (block A), vertically above (block B), and diagonally above and to the right (block C). Because the composite picture 152 uses column-by-column interleaving, block G will come from either the sampled left view picture 148 or the sampled right view picture 150, and blocks A and C will both come from the other sampled picture.
- Various implementations may form a predictor based on only one (rather than a combination) of blocks A, B, or C, or on other blocks including non-neighboring blocks.
- various implementations provide encoding modes that allow block G to be encoded with respect to block A alone, or with respect to block C alone.
- Such modes, which code block G using only block A or only block C, are expected to have increased coding efficiency with the interleaved composite picture 152, as compared to a split-screen format that is not interleaved (see the horizontal split screen picture 160 described below).
- the increased efficiency is expected to arise, at least in part, from being able to encode a block from one view (left or right) using a corresponding block from the other view. If the corresponding blocks are aligned well, then the residue will be small and will require fewer bits to encode. It is noted, however, that the alignment need not be perfect to reduce the residue and provide coding gains.
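- The sketch below illustrates the point with a toy cost comparison: the SAD residual of a block G against the corresponding cross-view block A, versus against a flat DC predictor. The helper names are hypothetical, and a real encoder would use rate-distortion costs rather than raw SAD:

```python
import numpy as np

def sad(a: np.ndarray, b: np.ndarray) -> int:
    """Sum of absolute differences between two equal-sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def predictor_costs(block_g: np.ndarray, block_a: np.ndarray) -> tuple[int, int]:
    """Compare a flat DC predictor against the cross-view block A.

    Returns (dc_cost, cross_view_cost). When the two views align well,
    cross_view_cost is much smaller, so the residue codes cheaply.
    """
    dc_pred = np.full_like(block_g, int(block_g.mean()))
    return sad(block_g, dc_pred), sad(block_g, block_a)
```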
- Various implementations perform interleaving to take advantage of the fact that in a stereoscopic view there is expected to be horizontal displacement between the two pictures but not vertical displacement. The best predictor in such cases is expected to be the corresponding block from the other stereoscopic view. That corresponding block will often be to the left of the block being coded after column-wise interleaving, and above the block being coded after row-wise interleaving.
- Various implementations perform intra-coding of the composite picture 152 by searching within the composite picture 152 for the best reference block. More specifically, several such implementations search within a reconstruction of those portions of the current picture that have already been encoded. Because of the searching, such a mode is often more time-intensive and processor-intensive than merely using predetermined neighboring blocks as the references. However, such a mode typically offers the advantage of finding a better prediction of a given block. Such a mode also typically offers the advantage of finding a corresponding stereo-image block without needing to know the disparity.
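- A brute-force version of such a search can be sketched as follows, assuming raster-order encoding so that only blocks above the current block row, or to its left on the current row, are reconstructed (hypothetical helper; a real encoder would constrain and accelerate the search):

```python
import numpy as np

def best_intra_reference(recon: np.ndarray, cur: np.ndarray,
                         x: int, y: int, bs: int = 16) -> tuple[int, int, int]:
    """Exhaustively search the already-reconstructed region for the block
    that best predicts `cur`, whose top-left corner is at (x, y).

    `recon` holds the reconstruction of everything encoded so far; in
    raster order that is every row above y, plus the pixels to the left
    of x on the current block row. Returns (best_x, best_y, best_sad).
    """
    h, w = recon.shape
    best = (0, 0, np.iinfo(np.int64).max)
    for ry in range(0, min(y + 1, h - bs + 1)):
        # candidates overlapping the current block row must lie fully to the left
        max_rx = (w - bs) if ry + bs <= y else (x - bs)
        for rx in range(0, max_rx + 1):
            cand = recon[ry:ry + bs, rx:rx + bs]
            cost = int(np.abs(cand.astype(np.int32) - cur.astype(np.int32)).sum())
            if cost < best[2]:
                best = (rx, ry, cost)
    return best
```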
- the encoding block 110 includes an H.264 decoder 156 that decodes the encoded picture to produce a decoded picture 158 .
- the encoded picture is decoded using HP@L4.x.
- the decoded picture 158 is a reconstruction of the composite picture 152 .
- the encoding block 110 deinterleaves the decoded picture 158 to form a horizontal split screen picture 160 .
- the horizontal split screen picture 160 includes a left picture reconstruction 162 of the sampled left view picture 148 , and includes a right picture reconstruction 164 of the sampled right view picture 150 .
- the horizontal split screen picture 160 is stored as a reference picture in a reference picture storage (not shown), and is available to be used as a reference picture by the encoding block 110 .
- P and B pictures are coded as horizontal split screen pictures. That is, for P and B pictures, the sampled left view picture 148 and the sampled right view picture 150 are formed into a horizontal split screen picture, rather than an interleaved composite picture, and encoded by the encoder 154. Reference pictures are also stored as horizontal split screen pictures, as indicated above. When P- or B-coded blocks contain motion references that point to the I picture, the motion-compensated prediction is extracted from the reconstructed horizontal split screen picture 160.
- the encoding block 110 thus performs different operations for I blocks, as compared to P and B blocks. For example, for I blocks the encoding block 110 performs (i) interleaving before encoding and (ii) deinterleaving before forming a horizontal split screen reconstruction picture. As another example, for P and B blocks, the encoding block 110 forms a split screen picture before encoding.
- the encoder 154 also provides the encoded picture (not shown) to the transmission operation 130 for transmission.
- the transmitted picture is received by the decoding block 120 .
- the decoding block 120 includes an H.264 decoder 170 that performs an HP@L4.x decode of the received picture.
- the decoder 170 produces a reconstructed picture 172 that is a reconstruction of the composite picture 152 . Accordingly, the reconstructed picture 172 has macroblocks interleaved from a left image (the sampled left view picture 148 ) and a right image (the sampled right view picture 150 ).
- the decoder 170 will be the same as the decoder 156 .
- the decoding block 120 deinterleaves the reconstructed picture 172 to form a horizontal split screen picture 174 that includes a left picture reconstruction 176 and a right picture reconstruction 178 . If there are no errors in transmission or decoding, (i) the reconstructed picture 172 will match the decoded picture 158 from the encoding block 110 , (ii) the horizontal split screen picture 174 will match the horizontal split screen picture 160 , (iii) the left picture reconstruction 176 will match the left picture reconstruction 162 , and (iv) the right picture reconstruction 178 will match the right picture reconstruction 164 .
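- Deinterleaving is the inverse of the interleaving sketched earlier; a minimal version under the same assumptions (macroblock-aligned NumPy arrays, illustrative names):

```python
import numpy as np

MB = 16

def deinterleave_blocks(comp: np.ndarray, axis: int = 1) -> np.ndarray:
    """Undo column-wise (axis=1) or row-wise (axis=0) macroblock
    interleaving: even 16-unit slices belong to the left view, odd
    slices to the right view, concatenated into a split-screen layout.
    """
    n = comp.shape[axis] // MB
    def sl(i):
        s = [slice(None)] * comp.ndim
        s[axis] = slice(i * MB, (i + 1) * MB)
        return tuple(s)
    left = np.concatenate([comp[sl(i)] for i in range(0, n, 2)], axis=axis)
    right = np.concatenate([comp[sl(i)] for i in range(1, n, 2)], axis=axis)
    return np.concatenate([left, right], axis=axis)
```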
- the decoding block 120 includes a sampler 180 that performs horizontal sample rate conversion to recover the original horizontal size.
- the sampler 180 performs the conversion by upsampling the left picture reconstruction 176 to recover the original horizontal size of the left view picture 140 .
- the sampler 180 produces a reconstructed left view picture 184 which is a reconstruction of the left view picture 140 .
- Upsampling is also referred to as rate converting or up-scaling.
- the decoding block 120 includes a sampler 182 that performs horizontal sample rate conversion to recover the original horizontal size.
- the sampler 182 performs the conversion by upsampling the right picture reconstruction 178 to recover the original horizontal size of the right view picture 142 .
- the sampler 182 produces a reconstructed right view picture 186 which is a reconstruction of the right view picture 142 .
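- A rough sketch of the factor-of-2 horizontal upsampling performed by the samplers 180 and 182; linear interpolation is assumed here, whereas a real system would use its own SRC filter:

```python
import numpy as np

def upsample_horizontal_2x(pic: np.ndarray) -> np.ndarray:
    """Recover the original horizontal size by a factor-of-2 SRC."""
    out = np.repeat(pic, 2, axis=1).astype(np.float32)
    # replace every interior odd column with the average of its neighbors
    out[:, 1:-1:2] = (out[:, 0:-2:2] + out[:, 2::2]) / 2.0
    return out.astype(pic.dtype)
```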
- the reconstructed left view picture 184 , and the reconstructed right view picture 186 are full resolution pictures ready for output to display.
- Other implementations also, or alternatively, provide the reconstructed left view picture 184 and/or the reconstructed right view picture 186 for processing.
- processing includes, for example, filtering, rendering further images, artifact reduction, color modification, edge sharpening, and/or object detection, and may be performed prior to display or in lieu of display.
- other implementations provide the horizontal split screen picture 174 as output for processing and/or display.
- the decoding block 120 also performs different operations for I blocks, as compared to P and B blocks. For example, for I blocks the decoding block 120 performs deinterleaving before forming the horizontal split screen picture 174 . In contrast, for P and B blocks, the output of the decoder 170 will be a horizontal split screen picture.
- the process of FIG. 1 is at least largely backward compatible with existing processes, and legacy H.264 encoders and decoders may be used. However, the process of FIG. 1 may not be completely backward compatible with all existing decode processes. Nonetheless, it is within the capability of many decoders to use an integrated Blit (for example, a programmable bitmap graphics device, or a bit-blit device performing bit-block image transfers to combine multiple bitmaps) or DMA capability to convert the macroblock-interleaved I picture into a left/right split screen image.
- an existing H.264 decoder might not be configured to convert the decoded (interleaved picture) 158 into the horizontal split screen picture 160 , or to convert the reconstructed picture 172 into the horizontal split screen picture 174 .
- techniques for performing this conversion are viable and well within the ordinary skill in the art using, for example, technologies such as integrated Blit or DMA. Additionally, such technologies can be used to selectively create either an interleaved image (for example, the composite picture 152 ) or a split screen concatenated image to be used as input to an H.264 encoder.
- FIG. 4 depicts a left picture 410 and a right picture 420 of a stereo-image pair.
- the left picture 410 and the right picture 420 are assumed to have been downsampled in the horizontal direction by a factor of 2.
- These two pictures 410 and 420 are combined, as shown by arrow 425 , to form an interleaved picture 430 .
- the interleaved picture 430 is effectively a column-wise interleaving of the pictures 410 and 420 .
- an encoder encodes the interleaved picture 430 row-by-row, from left to right. Accordingly, when the encoder gets to the block labeled R22 (circled in the interleaved picture 430), it has already encoded the corresponding block L22 (also circled) from the left picture, and has the encoding of L22 available to use in encoding R22. L22 is to the immediate left of R22 in the interleaved picture 430.
- L22 and R22 are corresponding blocks in a stereo-image pair, and so their content is assumed to overlap considerably. Content overlaps when both blocks have some common content: for example, when both blocks include a particular object or background, even if that object or background is not in exactly the same relative position in each block.
- identification of these corresponding blocks is based simply on the fact that L22 and R22 have corresponding locations in the two pictures 410 and 420. That is, L22 and R22 are assumed to have the same (x, y) coordinates in their respective pictures.
- Other implementations determine corresponding blocks based on, for example, disparity.
- in disparity-based implementations, a variety of disparity-based metrics may be used, such as, for example, the average disparity for the stereo-image pair.
- the average disparity of the picture 410 is determined to be equal to the horizontal size of a single block. Accordingly, the block L12 of the picture 410 is determined to correspond to the block R11 of the picture 420.
- the interleaving may still be performed as in the interleaved picture 430 , or the interleaving may be based on the disparity.
- in one implementation, the blocks are interleaved by position, as in the interleaved picture 430. In such an implementation, corresponding blocks may or may not be neighbors. If L12 corresponds to R11, those blocks would still be neighbors, as shown in the interleaved picture 430. If, however, the average disparity were two block widths, then L13 would correspond to R11, and those blocks would not be neighbors in the interleaved picture 430.
- in another implementation, the blocks are interleaved based on the disparity, so that if L13 corresponds to R11, those blocks are interleaved to be neighbors. In that case, the first two columns of the picture 410 are inserted directly into the interleaved picture, the remaining columns of the picture 410 are column-interleaved with columns from the picture 420, and the last remaining columns of the picture 420 are inserted directly into the interleaved picture, as sketched below.
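- A sketch of such disparity-guided interleaving, parameterized by an integer block shift (this layout is one reading of the description above, not a normative format; 0 <= shift < number of block columns is assumed):

```python
import numpy as np

MB = 16

def interleave_with_shift(left: np.ndarray, right: np.ndarray, shift: int) -> np.ndarray:
    """Column-wise macroblock interleaving with an integer block shift.

    Left block column i is paired with right block column i - shift, so
    corresponding content lands in adjacent block columns; unpaired
    leading left columns and trailing right columns are inserted directly.
    """
    n = left.shape[1] // MB
    col = lambda pic, i: pic[:, i * MB:(i + 1) * MB]
    out = [col(left, i) for i in range(shift)]            # unpaired left columns
    for i in range(shift, n):
        out.append(col(left, i))
        out.append(col(right, i - shift))                 # its corresponding right column
    out += [col(right, i) for i in range(n - shift, n)]   # remaining right columns
    return np.concatenate(out, axis=1)
```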
- in various scenarios the correspondence between blocks is not perfect; that is, the common content is not in the same relative position in each of the corresponding blocks, for example because the disparity is not exactly equal to the horizontal size of the blocks. Nonetheless, coding gains are still achieved.
- blocks from the various input images are interleaved based on their relative locations in the input images. For example, the first column of the picture 410 is followed by the first column of the picture 420 . However, an individual block of the interleaved picture is intra-coded by searching within the interleaved picture to find a good reference. Such a search may identify the corresponding block without the implementation knowing the disparity prior to the search.
- still referring to FIG. 4, an implementation is shown in which the left picture 410 and the right picture 420 are assumed to have been downsampled in the vertical direction by a factor of 2, rather than in the horizontal direction as previously described. The vertically-downsampled pictures 410 and 420 are then interleaved row-wise to form an interleaved picture 440, as shown by an arrow 435.
- the left and right pictures are downsampled in a combination of horizontal and vertical directions to reduce their sizes by a combined factor of 2.
- various combinations of downsampling in the horizontal and vertical directions are possible in order to achieve a combined factor of 2 reduction.
- These downsampled pictures may then be interleaved in various manners known to those of ordinary skill in the art, including a combination of row-wise and column-wise interleaving.
- Another implementation does not downsample at all, and the pictures 410 and 420 are assumed to be in their original sizes.
- This implementation simply combines the left and right pictures using any of various interleaving options known in the art to produce a large interleaved picture.
- the H.264 encoder then encodes this large interleaved picture.
- an encoder and a decoder de-interleave the interleaved pictures 430 and 440 to form a reconstruction of a typical left/right horizontal split screen view, such as that provided by the horizontal split screen picture 174 of FIG. 1 .
- the encoder and decoder do not perform this operation. Rather, the encoder and decoder simply produce a reconstructed interleaved picture that still has the left and right views interleaved. The encoder uses this interleaved reconstruction to perform encoding of subsequent pictures.
- when encoding a subsequent P picture, the encoder performs the search for an appropriate motion vector in the normal manner using the interleaved I picture. In this manner, the encoder may determine that a block corresponding to either the left or the right view is the best “match” for the current block being encoded in the P picture.
- Other implementations expand the search window used in finding the best “match” in the reference picture to account for the fact that the interleaving has spread the blocks of the component left and right pictures further apart in the interleaved reference picture.
- referring to FIG. 5, a process 500 is shown for use in encoding two images, or portions thereof.
- the process 500 includes accessing a first-image block ( 510 ).
- the first image may be, for example, the sampled left view picture 148 of FIG. 1
- the first-image block may be, for example, the top left block from the sampled left view picture 148 .
- the process 500 includes accessing a second-image block that overlaps the first-image block in content ( 520 ).
- the second image may be, for example, the sampled right view picture 150 of FIG. 1. Both the sampled left view picture 148 and the sampled right view picture 150 are generated from a stereo-image pair, and so are assumed to overlap in content.
- the second-image block may be, for example, the top left block from the sampled right view picture 150 .
- if the disparity for the top left block of the sampled left view picture 148 is greater than the horizontal block size of that block, it is possible that its content does not overlap the content of the top left block of the sampled right view picture 150.
- content may overlap, for example, when the two blocks include a common feature, even if the feature is not aligned in the same relative location in each of the two blocks. Such overlap typically occurs in stereo-image pairs, as well as in the separate views of a multi-view system. Content may also overlap regardless of whether one of the images is flipped, rotated, filtered, or otherwise processed.
- the process 500 includes block interleaving a portion from the first image that includes the first-image block, and a portion from the second image that includes the second-image block ( 530 ).
- the two portions may include the entire first and second images. Alternatively, the two portions may include less than all of the first and second images.
- the block interleaving may be, for example, as described above for forming the composite picture 152 of FIG. 1 .
- the process 500 includes encoding the interleaved first-image block using the interleaved second-image block as a reference ( 540 ).
- the encoding may be performed, for example, as described above for using the encoder 154 of FIG. 1 to encode blocks from the composite picture 152 .
- encoding block G of FIG. 3 using block A as a predictor will result in a first-image block (block G) being encoded using a second-image block (block A) as a reference.
- FIG. 6 depicts a process 600 for use in decoding two images, or portions thereof.
- the process 600 includes accessing an encoding of an image ( 610 ).
- the image is an interleaved image in which two images have been interleaved on a block basis.
- the two images are a first image that includes multiple first-image blocks, and a second image that includes multiple second-image blocks.
- the encoding may be, for example, the received picture that is received and decoded by the decoding block 120 of FIG. 1 discussed above.
- the process 600 includes decoding a portion of the accessed encoding ( 620 ).
- the portion includes an encoding of a first-image block that has been encoded using a second-image block as a reference.
- the first-image block may be, as suggested above, the top left block from the sampled left view picture 148 .
- the second-image block may be, as suggested above, the top left block from the sampled right view picture 150 , which is assumed in this discussion to have overlapping content with the top left block from the sampled left view picture 148 .
- the decoding may be performed by, for example, the H.264 decoder 170 of FIG. 1 discussed above.
- referring to FIG. 7, an encoder 700 depicts an implementation of an encoder that may be used to encode images such as, for example, video images or depth images.
- the encoder 700 is used as the encoder 154 in the system 100 of FIG. 1 .
- the encoder 700 may also be used to encode data, such as, for example, metadata providing information about the encoded bitstream.
- the encoder 700 may be implemented as part of, for example, a video transmission system as described below with respect to FIG. 9 . It should also be clear that the blocks of FIG. 7 provide a flow diagram of an encoding process, in addition to providing a block diagram of an encoder.
- An input image sequence arrives at an adder 701 as well as at a displacement compensation block 720 and a displacement estimation block 718 .
- displacement refers, for example, to either motion or disparity.
- Another input to the adder 701 is one of a variety of possible reference picture information received through a switch 723 .
- a mode decision module 724 in signal communication with the switch 723 determines that the encoding mode should be intra-prediction with reference to a block from the same picture currently being encoded
- the adder 701 receives its input from an intra-prediction module 722 .
- the mode decision module 724 determines that the encoding mode should be displacement compensation and estimation with reference to a picture that is different from the picture currently being encoded
- the adder 701 receives its input from the displacement compensation module 720 .
- the intra-prediction module 722 provides a predetermined predictor based on one or more blocks that are neighboring blocks to a block being encoded.
- Such neighboring blocks may be interleaved blocks from another input image, such as, for example, a picture that forms a stereo-image pair with the picture being encoded.
- the interleaving is based on (x,y) coordinates, such that the blocks are interleaved in the order in which they appear in the constituent pictures.
- the interleaving is based on disparity, such that blocks that correspond in content are interleaved adjacent to each other to the extent possible, regardless of where those blocks are located in their constituent pictures.
- One particular implementation provides a practical use of this concept by coding a single value specifying the integer number of blocks of shift between the left and right pictures before interleaving. This allows an average disparity measurement at the encoder to guide the interleaving, costs very little to code in the stream, and allows easy descrambling of the blocks at the decoder prior to display.
- the intra-prediction module 722 provides a predictor (a reference) by searching within the picture being encoded for the best reference block. More specifically, several such implementations search within a reconstruction of those portions of the current picture that have already been encoded. In some implementations, the searching is restricted to blocks that lie on the existing block boundaries. However, in other implementations, the searching is allowed to search blocks regardless of whether those blocks cross existing block boundaries. Because of the searching, such implementations are often more time-intensive and processor-intensive than merely using predetermined neighboring blocks as the references. However, such implementations typically offer the advantage of finding a better prediction of a given block. Such implementations also typically offer the advantage of finding a corresponding stereo-image block, or corresponding multi-view-image block, without needing to know the disparity.
- Such implementations may lead to a best estimate Intra prediction block.
- the boundaries of the reference block can lie on a sub-pixel boundary, and recovery of the reference involves an interpolation step to restore the actual block to be used as reference during decoding.
- such sub-pixel interpolation implementations may improve compression efficiency compared to the use of neighboring blocks as references.
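- Recovering a sub-pixel reference block can be sketched with bilinear interpolation; the patent does not specify the interpolation filter, and bounds checking is omitted here:

```python
import numpy as np

def subpel_reference(recon: np.ndarray, x: float, y: float, bs: int = 16) -> np.ndarray:
    """Interpolate a bs-by-bs reference block whose top-left corner lies
    at the fractional position (x, y) in the reconstructed picture."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x0, y - y0
    p = recon.astype(np.float32)
    a = p[y0:y0 + bs,         x0:x0 + bs]          # integer-aligned sample grid
    b = p[y0:y0 + bs,         x0 + 1:x0 + 1 + bs]  # shifted right by one sample
    c = p[y0 + 1:y0 + 1 + bs, x0:x0 + bs]          # shifted down by one sample
    d = p[y0 + 1:y0 + 1 + bs, x0 + 1:x0 + 1 + bs]  # shifted down and right
    return ((1 - fx) * (1 - fy) * a + fx * (1 - fy) * b
            + (1 - fx) * fy * c + fx * fy * d)
```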
- the adder 701 provides a signal to a transform module 702 , which is configured to transform its input signal and provide the transformed signal to a quantization module 704 .
- the quantization module 704 is configured to perform quantization on its received signal and output the quantized information to an entropy encoder 705 .
- the entropy encoder 705 is configured to perform entropy encoding on its input signal to generate a bitstream.
- An inverse quantization module 706 is configured to receive the quantized signal from quantization module 704 and perform inverse quantization on the quantized signal.
- an inverse transform module 708 is configured to receive the inverse quantized signal from the inverse quantization module 706 and perform an inverse transform on its received signal.
- the output of the inverse transform module 708 is a reconstruction of the signal that is output from the adder 701 .
- An adder 709 adds (combines) signals received from the inverse transform module 708 and the switch 723 and outputs the resulting signal to the intra prediction module 722 and an in-loop filter 710 .
- the resulting signal is a reconstruction of the image sequence signal that is input to the encoder 700 .
- the intra prediction module 722 performs intra-prediction, as discussed above, using its received signals.
- the in-loop filter 710 filters the signals received from the adder 709 and provides filtered signals to a reference buffer 712 .
- the reference buffer 712 provides image information to the displacement estimation and compensation modules 718 and 720 .
- Metadata may be added to the encoder 700 as encoded metadata and combined with the output bitstream from the entropy coder 705 .
- unencoded metadata may be input to the entropy coder 705 for entropy encoding along with the quantized image sequences.
- the mode decision module 724 provides information to the bitstream that indicates the mode used to encode a given block. Such information often includes an indication of the location of the reference block. For example, in various implementations that use intra-prediction and that perform a search of the current picture to find a reference block, the mode decision module 724 indicates the location of the reference using a disparity vector. The disparity vector information may be provided to the mode decision module 724 by the intra prediction module 722 .
- the disparity vector information may be differentially coded using the disparity vector of a neighboring macroblock as a reference.
- disparity vectors for a picture may be grouped and additionally encoded to remove entropy since there is likely to be spatial similarity in disparity vectors.
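- A minimal sketch of such differential coding, using the previous (left-neighbor) vector as the predictor; entropy coding of the residuals is omitted, and this scheme is an assumption rather than the patent's normative syntax:

```python
def code_disparity_vectors(dvs: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Differentially code each macroblock's disparity vector against its
    predecessor (taken as (0, 0) at the start of a row or picture).
    Spatially similar vectors yield small, cheaply coded residuals."""
    coded, prev = [], (0, 0)
    for dv in dvs:
        coded.append((dv[0] - prev[0], dv[1] - prev[1]))
        prev = dv
    return coded

def decode_disparity_vectors(coded: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Invert code_disparity_vectors via a running (prefix) sum."""
    out, prev = [], (0, 0)
    for r in coded:
        prev = (prev[0] + r[0], prev[1] + r[1])
        out.append(prev)
    return out
```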
- referring to FIG. 8, a decoder 800 depicts an implementation of a decoder that may be used to decode images and provide them to, for example, a display device.
- the decoder 800 may also be used to decode, for example, metadata providing information about the decoded bitstream.
- the decoder 800 is used as the decoder 156 and/or the decoder 170 in the system 100 of FIG. 1 .
- the decoder 800 may be implemented as part of, for example, a video receiving system as described below with respect to FIG. 10 . It should also be clear that the blocks of FIG. 8 provide a flow diagram of a decoding process, in addition to providing a block diagram of a decoder.
- the decoder 800 is configured to receive a bitstream using a bitstream receiver 802 .
- the bitstream receiver 802 is in signal communication with a bitstream parser 804 and provides the bitstream to the bitstream parser 804 .
- the bitstream parser 804 is configured to transmit a residue bitstream to an entropy decoder 806 , to transmit control syntax elements to a mode selection module 816 , and to transmit displacement (motion/disparity) vector information to a displacement compensation module 826 and to an intra prediction module 818 .
- the displacement vector information may be, for example, motion vector information or disparity vector information.
- Motion vector information is typically used in inter-prediction to indicate relative motion from a previous image.
- Disparity vector information is typically used in either (i) inter-prediction to indicate disparity with respect to a separate image or (ii) intra-prediction to indicate disparity with respect to a portion of the same image.
- disparity typically indicates the relative offset, or displacement, between two images.
- Disparity may also be used to indicate the relative offset, or displacement, between two portions of an image.
- An inverse quantization module 808 performs inverse quantization on an entropy decoded signal received from the entropy decoder 806 .
- an inverse transform module 810 is configured to perform an inverse transform on an inverse quantized signal received from the inverse quantization module 808 and to output the inverse transformed signal to an adder (also referred to as a combiner) 812 .
- the adder 812 can receive one of a variety of other signals depending on the decoding mode employed.
- the mode selection module 816 can determine whether displacement compensation or intra prediction encoding was performed on the currently processed block by the encoder, by parsing and analyzing the control syntax elements.
- based on the control syntax elements, the mode selection module 816 can access and control a switch 817 so that the adder 812 receives signals from the displacement compensation module 826 or from the intra prediction module 818.
- the intra prediction module 818 is configured to perform intra prediction to decode a block using references to the same picture currently being decoded.
- the displacement compensation module 826 is configured to perform displacement compensation to decode a block using references to a block of another previously processed picture that is different from the picture currently being decoded.
- the intra prediction module 818 of various implementations receives disparity vector information from the bitstream parser 804 identifying the location of the reference block used in intra-prediction.
- in such implementations, the block has typically been encoded in an intra-coding mode that searches the picture being coded to find a reference. This is in contrast, for example, to using one or more predetermined blocks from the picture being encoded to generate a predictor.
- the adder 812 After receiving prediction or compensation information signals, the adder 812 adds the prediction or compensation information signals with the inverse transformed signal for transmission to an in-loop filter 814 , such as, for example, a deblocking filter that filters out blocking artifacts.
- the adder 812 also outputs the added signal to the intra prediction module 818 for use in intra prediction.
- the in-loop filter 814 is configured to filter its input signal and output decoded pictures. Further, the in-loop filter 814 provides the filtered signal to a reference buffer 820 .
- the reference buffer 820 is configured to parse its received signal to permit and aid in displacement compensation decoding by the displacement compensation module 826 , to which the reference buffer 820 provides parsed signals. Such parsed signals may be, for example, all or part of various pictures that may have been used as a reference.
- Metadata may be included in a bitstream provided to the bitstream receiver 802 .
- the metadata may be parsed by the bitstream parser 804 , and decoded by the entropy decoder 806 .
- the decoded metadata may be extracted from the decoder 800 after the entropy decoding using an output (not shown).
- the video transmission system 900 may be, for example, a head-end or transmission system for transmitting a signal using any of a variety of media, such as, for example, satellite, cable, telephone-line, or terrestrial broadcast.
- the transmission may be provided over the Internet or some other network.
- the video transmission system 900 is capable of generating and delivering, for example, video content and other content such as, for example, indicators of depth including, for example, depth and/or disparity values.
- the blocks of FIG. 9 provide a flow diagram of a video transmission process, in addition to providing a block diagram of a video transmission system/apparatus.
- the video transmission system 900 receives input video from a processing device 901 .
- the processing device 901 simply provides original-sized images, such as the left view picture 140 and the right view picture 142 , to the video transmission system 900 .
- the processing device 901 is a processor configured for performing down-sampling and interleaving as described above for the system 100 with respect to the operations of the sampler 144 and the sampler 146 as well as the interleaving that results in the composite picture 152 .
- Various implementations of the processing device 901 include, for example, processing devices implementing the operations 510 , 520 , and 530 of the process 500 of FIG. 5 .
- the processing device 901 may also provide metadata to the video transmission system 900 indicating whether the input picture is interleaved and/or providing various parameters describing the interleaving.
- Such parameters include, for example, number of interleaved pictures, conversion rate for each picture, conversion type (for example, horizontal sampling or vertical sampling) for each picture, or interleaving mode (for example, row-wise interleaving or column-wise interleaving).
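- Such parameters might be carried in a structure like the following hypothetical container; the patent does not define a concrete syntax for this metadata:

```python
from dataclasses import dataclass

@dataclass
class InterleavingMetadata:
    """Hypothetical carrier for the interleaving parameters listed above."""
    is_interleaved: bool = True
    num_pictures: int = 2                 # number of interleaved pictures
    conversion_rate: float = 0.5          # sample rate conversion per picture
    conversion_type: str = "horizontal"   # "horizontal" or "vertical" sampling
    interleaving_mode: str = "column"     # "row"-wise or "column"-wise
    block_shift: int = 0                  # optional disparity-based shift, in blocks
```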
- the video transmission system 900 includes an encoder 902 and a transmitter 904 capable of transmitting the encoded signal.
- the encoder 902 receives video information from the processor 901 .
- the video information may include, for example, images and depth indicators.
- the encoder 902 generates an encoded signal(s) based on the video information.
- the encoder 902 may be, for example, the encoding block 110 , the encoder 154 , or the encoder 700 .
- the encoder 902 may include sub-modules, including for example an assembly unit for receiving and assembling various pieces of information into a structured format for storage or transmission.
- the various pieces of information may include, for example, coded or uncoded video, coded or uncoded depth indicators and/or information, and coded or uncoded elements such as, for example, motion vectors, coding mode indicators, and syntax elements.
- the encoder 902 includes the processor 901 and therefore performs the operations of the processor 901 .
- the transmitter 904 receives the encoded signal(s) from the encoder 902 and transmits the encoded signal(s) in one or more output bitstreams.
- the transmitter 904 may be, for example, adapted to transmit a program signal having one or more bitstreams representing encoded pictures and/or information related thereto.
- Typical transmitters perform functions such as, for example, one or more of providing error-correction coding, interleaving the data in the signal, randomizing the energy in the signal, and modulating the signal onto one or more carriers using a modulator 906 .
- the transmitter 904 may include, or interface with, an antenna (not shown). Further, implementations of the transmitter 904 may be limited to the modulator 906 .
- the video receiving system 1000 may be configured to receive signals over a variety of media, such as, for example, satellite, cable, telephone-line, or terrestrial broadcast. The signals may be received over the Internet or some other network. It should also be clear that the blocks of FIG. 10 provide a flow diagram of a video receiving process, in addition to providing a block diagram of a video receiving system/apparatus.
- the video receiving system 1000 may be, for example, a cell-phone, a computer, a set-top box, a television, or other device that receives encoded video and provides, for example, decoded video for display to a user, for processing, or for storage.
- the video receiving system 1000 may provide its output to, for example, a screen of a television, a computer monitor, a computer (for storage, processing, or display), or some other storage, processing, or display device.
- the video receiving system 1000 is capable of receiving and processing video content including video information.
- the video receiving system 1000 includes a receiver 1002 for receiving an encoded signal, such as for example the signals described in the implementations of this application.
- the receiver 1002 may receive, for example, a signal providing the received picture to the decoding block 120 of FIG. 1 , a signal carrying the bitstream from the encoder 700 of FIG. 7 , or a signal output from the video transmission system 900 of FIG. 9 .
- the receiver 1002 may be, for example, adapted to receive a program signal having a plurality of bitstreams representing encoded pictures. Typical receivers perform functions such as, for example, one or more of receiving a modulated and encoded data signal, demodulating the data signal from one or more carriers using a demodulator 1004 , de-randomizing the energy in the signal, de-interleaving the data in the signal, and error-correction decoding the signal.
- the receiver 1002 may include, or interface with, an antenna (not shown). Implementations of the receiver 1002 may be limited to the demodulator 1004 .
- the video receiving system 1000 includes a decoder 1006 .
- the receiver 1002 provides a received signal to the decoder 1006 .
- the decoder 1006 outputs a decoded signal, such as, for example, decoded video signals including video information.
- the decoder 1006 may be, for example, the decoder 156 or the decoder 170 of the system 100 of FIG. 1 , or the decoder 800 of FIG. 8 .
- the output video from the decoder 1006 is provided, in one implementation, to a processing device 1008 .
- the processing device 1008 is, in one implementation, a processor configured for performing deinterleaving and up-sampling as described above for the system 100 with respect to the deinterleaving that results in the horizontal split screen picture 174 as well as the operations of the sampler 180 and the sampler 182 .
- the decoder 1006 includes the processor 1008 and therefore performs the operations of the processor 1008 .
- the processor 1008 is part of a downstream device such as, for example, a set-top box or a television.
- the terms “image” and “picture” are used interchangeably throughout this document, and are intended to be broad terms.
- An “image” or a “picture” may be, for example, all or part of a frame or of a field.
- video refers to a sequence of images (or pictures).
- an image or a picture may include, for example, any of various video components, or combinations thereof.
- Such components include, for example, luminance, chrominance, Y (of YUV or YCbCr or YPbPr), U (of YUV), V (of YUV), Cb (of YCbCr), Cr (of YCbCr), Pb (of YPbPr), Pr (of YPbPr), red (of RGB), green (of RGB), blue (of RGB), S-Video, and negatives or positives of any of these components.
- An “image” or a “picture” may also, or alternatively, refer to various different types of content, including, for example, typical two-dimensional video, a disparity map for a 2D video picture, a depth map that corresponds to a 2D video picture, or an edge map.
- “determining” the information may include one or more of, for example, estimating the information, calculating the information, predicting the information, identifying the information, or retrieving the information from memory.
- any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) without the second listed option (B), or the selection of the second listed option (B) without the first listed option (A), or the selection of both options (A and B).
- such phrasing is intended to encompass the selection of the first listed option (A) without the second (B) and third (C) listed options, or the selection of the second listed option (B) without the selection of the first (A) and third (C) listed options, or the selection of the third listed option (C) without the selection of the first (A) and second (B) listed options, or the selection of the first and the second listed options (A and B) without the selection of the third listed option (C), or the selection of the first and third listed options (A and C) without the selection of the second listed option (B), or the selection of the second and third listed options (B and C) without the selection of the first listed option (A), or the selection of all three options (A and B and C).
- the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment. Additionally, these phrases (for example, “in one embodiment”) are not intended to indicate that there is only one possible embodiment, but rather to draw attention to the fact that a particular embodiment is being discussed.
- the implementations described herein may be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed may also be implemented in other forms (for example, an apparatus or program).
- An apparatus may be implemented in, for example, appropriate hardware, software, and/or firmware.
- the methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (“PDAs”), and other devices that facilitate communication of information between end-users or devices.
- Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications, particularly, for example, equipment or applications associated with data encoding and decoding.
- equipment include an encoder, a decoder, a post-processor processing output from a decoder, a pre-processor providing input to an encoder, a video coder, a video decoder, a video codec, a web server, a set-top box, a laptop, a personal computer, a cell phone, a PDA, and other communication devices.
- the equipment may be mobile and even installed in a mobile vehicle.
- the methods may be implemented by instructions being performed by a processor, and such instructions (and/or data values produced by an implementation) may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact diskette, a random access memory (“RAM”), or a read-only memory (“ROM”).
- the instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two.
- a processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a processor-readable medium (such as a storage device) having instructions for carrying out a process. Further, a processor-readable medium may store, in addition to or in lieu of instructions, data values produced by an implementation.
- implementations may produce a variety of signals formatted to carry information that may be, for example, stored or transmitted.
- the information may include, for example, instructions for performing a method, or data produced by one of the described implementations.
- a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal.
- the formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream.
- the information that the signal carries may be, for example, analog or digital information.
- the signal may be transmitted over a variety of different wired or wireless links, as is known.
- the signal may be stored on a processor-readable medium.
Description
- This application claims the benefit of the filing date of the following U.S. Provisional Application, which is hereby incorporated by reference in its entirety for all purposes: Ser. No. 61/337,060, filed on Jan. 29, 2010, and titled “Macroblock interleaving for improved 3D compression”.
- Implementations are described that relate to image compression. Various particular implementations relate to compression of interleaved images, and the interleaved images may be formed of images having overlapping content.
- Various techniques are known to compress images, including stereoscopic images and multi-view images. AVC, which refers to the existing International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “H.264/MPEG-4 AVC Standard” or variations thereof, such as the “AVC standard”, the “H.264 standard”, or simply “AVC” or “H.264”), may be used to compress such images individually. “I” frames are typically compressed in AVC using intra-coding techniques.
- According to a general aspect, a first image that includes multiple blocks is accessed. The multiple blocks of the first image include a first-image block. A second image that includes multiple blocks is accessed. The multiple blocks of the second image include a second-image block that has overlapping content with the first-image block. The multiple blocks of the first image and the multiple blocks of the second image are interleaved on a block basis to form an interleaved image. At least a portion of the interleaved image is encoded by encoding the first-image block using the second-image block as a reference. The encoded first-image block is provided for transmission or storage.
- According to another general aspect, a video signal or a video signal structure includes one or more picture portions for an encoding. The encoding is an encoding of a block-based interleaving of multiple blocks of a first image and multiple blocks of a second image. The multiple blocks of the first image include a first-image block, and the multiple blocks of the second image include a second-image block that has overlapping content with the first-image block. The encoding of the first-image block uses the second-image block as a reference.
- According to another general aspect, an encoded image is accessed. The encoded image is an encoding of a block-based interleaving of multiple blocks of a first image and multiple blocks of a second image. The multiple blocks of the first image include a first-image block, and the multiple blocks of the second image include a second-image block that has overlapping content with the first-image block. A portion of the encoded image is decoded. The encoded image portion encodes the first-image block using the second-image block as a reference. The decoded portion is provided for processing or display.
- The details of one or more implementations are set forth in the accompanying drawings and the description below. Even if described in one particular manner, it should be clear that implementations may be configured or embodied in various manners. For example, an implementation may be performed as a method, or embodied as an apparatus, such as, for example, an apparatus configured to perform a set of operations or an apparatus storing instructions for performing a set of operations, or embodied in a signal. Other aspects and features will become apparent from the following detailed description considered in conjunction with the accompanying drawings and the claims.
- FIG. 1 is a block/flow diagram depicting an example of a system and process for encoding and decoding images that may be used with one or more implementations.
- FIG. 2 is a block diagram depicting examples of neighboring blocks that may be used with one or more implementations.
- FIG. 3 is a block diagram depicting examples of neighboring reference blocks that may be used with one or more implementations.
- FIG. 4 is a block/flow diagram depicting examples of vertical interleaving and horizontal interleaving that may be used with one or more implementations.
- FIG. 5 is a flow diagram depicting an example of an encoding process that may be used with one or more implementations.
- FIG. 6 is a flow diagram depicting an example of a decoding process that may be used with one or more implementations.
- FIG. 7 is a block/flow diagram depicting an example of an encoding system that may be used with one or more implementations.
- FIG. 8 is a block/flow diagram depicting an example of a decoding system that may be used with one or more implementations.
- FIG. 9 is a block/flow diagram depicting an example of a video transmission system that may be used with one or more implementations.
- FIG. 10 is a block/flow diagram depicting an example of a video receiving system that may be used with one or more implementations.
- At least one implementation described in this application seeks to improve the efficiency of compressing a stereo image pair that has been merged into a single image. The implementation rearranges the stereo image pair in a way that allows the H.264 compression algorithm to take better advantage of intra block prediction. The left view and right view pictures of the stereo image pair are interleaved at the macroblock level. The left view and right view pictures are encoded together as a single picture, and the interleaved picture arrangement typically improves intra prediction efficiency versus typical horizontal or vertical split screen arrangements.
- In block-based compression algorithms (for example, MPEG-2 and MPEG-4), the inventors have determined that a disproportionate percentage of the total bit budget allocated to a compressed stream is spent on I-picture compression. Note that I pictures are often used as reference pictures. In the near term, broadcast 3D video is likely to rely on a split screen approach to deliver a left/right stereo image pair. A typical arrangement is a left and right picture, each horizontally sub-sampled by half, concatenated to form a single full-size composite left+right picture.
- Horizontal sub-sampling and vertical sub-sampling are both used in current generation half resolution 3D encoders. Typically, horizontal sub-sampling is used for 1920×1080 source material, and vertical sub-sampling is used for 1280×720p source material.
- The advantage of these sub-sampling approaches is that the composite picture can be encoded and decoded by legacy equipment with the display device responsible for separating the left and right images. While convenient, this approach does not take good advantage of the redundancy between the left and right images. By rearranging the left and right images in a way that allows the compression algorithm to take better advantage of this redundancy, the resulting compressed image stream can still remain largely compatible with legacy encode/decode tools while increasing the compression efficiency of the coded I (or reference) pictures.
- The above approach can be used as an alternative to MVC (multi-view coding). Although alternatives, the above approach and MVC are not necessarily equivalent, in that the two approaches may produce different results. MVC refers more specifically to the multi-view video coding (“MVC”) extension (Annex H) of the AVC standard, referred to as H.264/MPEG-4 AVC, MVC extension (the “MVC extension” or simply “MVC”). MVC is a non-backward-compatible compression algorithm, an extension of the H.264/MPEG-4 AVC standard, that has been developed to take advantage of, for example, the redundancy between left and right views in a stereo image pair.
- Referring to FIG. 1, a system 100 is shown that provides an implementation for processing intra coded pictures (that is, I pictures). Intra coded pictures follow the process illustrated in FIG. 1 and described below. The system 100 includes an encoding block 110, a decoding block 120, and a transmission operation 130 that links the encoding block 110 and the decoding block 120.
- Full resolution input pictures for a stereo-image pair are provided as input to the encoding block 110. The full resolution stereo images include a left view picture 140 and a right view picture 142. The full resolution images are down-sampled in the horizontal dimension by ½, resulting in a horizontal sample rate conversion ("SRC") to ½ the original horizontal size. Down-sampling is also referred to as sub-sampling, rate converting, or down-scaling. The encoding block 110 includes a sampler 144 that down-samples the left view picture 140, and a sampler 146 that down-samples the right view picture 142. The sampler 144 produces a sampled left view picture 148 that is ½ the size of the left view picture 140 in the horizontal dimension. Similarly, the sampler 146 produces a sampled right view picture 150 that is ½ the size of the right view picture 142 in the horizontal dimension.
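- As a concrete illustration of the horizontal SRC step, the following sketch halves the width of a picture by averaging horizontally adjacent pixel pairs. This is a minimal sketch only: the implementation above does not specify a particular down-sampling filter, and practical encoders typically use longer polyphase filters, so the averaging filter, the function name, and the use of NumPy arrays are assumptions made for illustration.

```python
import numpy as np

def downsample_half_width(picture: np.ndarray) -> np.ndarray:
    """Horizontal 1/2 sample rate conversion by averaging pixel pairs.

    picture: 2-D (grayscale) or 3-D (height x width x channels) array
    with an even width. Returns a picture of half the original width.
    """
    h, w = picture.shape[:2]
    assert w % 2 == 0, "expects an even width"
    # Group horizontally adjacent pixels into pairs, then average each pair.
    pairs = picture.reshape(h, w // 2, 2, *picture.shape[2:])
    return pairs.mean(axis=2).astype(picture.dtype)
```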
- The sampled left view picture 148 and the sampled right view picture 150 are interleaved to form an interleaved composite picture 152. The composite picture 152 is formed by decomposing (also referred to as partitioning or dividing) the sampled left view picture 148 into 16×16 macroblocks, decomposing the sampled right view picture 150 into 16×16 macroblocks, and interleaving the macroblocks from the left view picture 148 and the right view picture 150 to form the composite picture 152.
- In the implementation shown in FIG. 1, the macroblocks are interleaved on an alternating basis in a column-by-column format, as explained further with respect to FIG. 4 below. This results in a composite picture 152 that has the same vertical dimension as the sampled left view picture 148 and the sampled right view picture 150, and twice the horizontal dimension of either of the sampled pictures.
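- The column-by-column macroblock interleaving just described can be sketched as follows. The sketch assumes two same-size pictures whose width is a multiple of the block size, and uses the 16-pixel macroblock width from the implementation above; the function name is an assumption.

```python
import numpy as np

def interleave_columns(left: np.ndarray, right: np.ndarray, block: int = 16) -> np.ndarray:
    """Column-wise block interleave: L0 R0 L1 R1 ... per block column.

    Returns a composite picture with the same height as the inputs and
    twice their width, alternating block-wide columns from each view.
    """
    assert left.shape == right.shape and left.shape[1] % block == 0
    h, w = left.shape[:2]
    out = np.empty((h, 2 * w) + left.shape[2:], dtype=left.dtype)
    for c in range(w // block):
        # Left-view block column c goes to even slot 2c, right-view to odd slot 2c+1.
        out[:, (2 * c) * block:(2 * c + 1) * block] = left[:, c * block:(c + 1) * block]
        out[:, (2 * c + 1) * block:(2 * c + 2) * block] = right[:, c * block:(c + 1) * block]
    return out
```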
- The encoding block 110 includes an H.264 encoder 154 that encodes the composite picture 152. In the encoding block 110, the composite picture 152 is encoded using HP@L4.x to form an encoded picture (not shown). HP@L4.x refers to High Profile, level 4.x, which includes levels 4.0, 4.1, and 4.2. However, other implementations use any of several of the H.264 coding profiles, such as, for example, all levels of the Baseline Profile, all levels of the Main Profile, and all levels of the High Profile.
- The encoder 154 encodes the composite picture 152 as an I picture, and uses H.264 intra-coding modes. Accordingly, the blocks of the composite picture 152 are encoded using one or more other blocks from the composite picture 152 as a reference. For example, a predictor for a given block may be formed from a combination of neighboring blocks. A neighboring block of a given block is commonly defined to be one of the eight blocks that touch the given block on a corner or edge. Referring to FIG. 2, the eight neighboring blocks of a middle block M are shown as blocks 1-8. Note that for purposes of H.264 intra prediction modes, blocks 1, 2, 3, 4, and 6 are generally allowed as predictors.
- Referring to FIG. 3, various implementations form the predictor for a given block (block G) from a combination of blocks lying horizontally to the left (block A), vertically above (block B), and diagonally to the right and above (block C). Because the composite picture 152 uses column-by-column interleaving, it should be clear that block G will be from one of either the sampled left view picture 148 or the sampled right view picture 150, and that blocks A and C will both be from the other sampled picture. Various implementations may form a predictor based on only one (rather than a combination) of blocks A, B, or C, or on other blocks including non-neighboring blocks. In particular, various implementations provide encoding modes that allow block G to be encoded with respect to block A alone, or with respect to block C alone.
- Such modes, which code block G using only block A or only block C, are expected to have increased coding efficiency by using the interleaved composite picture 152, as compared to using a split screen format that is not interleaved (see the horizontal split screen picture 160 described below). The increased efficiency is expected to arise, at least in part, from being able to encode a block from one view (left or right) using a corresponding block from the other view. If the corresponding blocks are aligned well, then the residue will be small and will require fewer bits to encode. It is noted, however, that the alignment need not be perfect to reduce the residue and provide coding gains.
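- To make the reference relationship concrete, the following sketch forms the residual for a block G using the block immediately to its left (block A) as the predictor; after column-wise interleaving, that left neighbor is the co-located block from the other view. This is an illustration of the idea rather than of an actual H.264 mode (H.264 intra prediction extrapolates from neighboring reconstructed edge pixels rather than copying whole blocks), and the function name is an assumption.

```python
import numpy as np

def residual_from_left_block(composite: np.ndarray, row: int, col: int, block: int = 16) -> np.ndarray:
    """Residual of block (row, col) predicted from the block to its left.

    (row, col) are block indices into the interleaved composite picture.
    For a right-view block, the left neighbor is the co-located left-view
    block, so the residual is small wherever the two views overlap.
    """
    assert col > 0, "a left neighbor must exist"
    r0, c0 = row * block, col * block
    g = composite[r0:r0 + block, c0:c0 + block].astype(np.int32)
    a = composite[r0:r0 + block, c0 - block:c0].astype(np.int32)
    return g - a  # small residuals cost few bits after transform coding
```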
- As noted above, blocks 1, 2, 3, 4, and 6 as shown in FIG. 2 may be used as predictors for block M in H.264 intra prediction. Various implementations, however, perform interleaving to take advantage of the fact that in a stereoscopic view there is expected to be horizontal displacement between the two pictures but not vertical displacement. The best predictor in such cases is expected to be the corresponding block from the other stereoscopic view. That corresponding block will often be to the left of the block being coded after column-wise interleaving, and will often be above the block being coded after row-wise interleaving.
- Various implementations perform intra-coding of the composite picture 152 by searching within the composite picture 152 for the best reference block. More specifically, several such implementations search within a reconstruction of those portions of the current picture that have already been encoded. Because of the searching, such a mode is often more time-intensive and processor-intensive than merely using predetermined neighboring blocks as the references. However, such a mode typically offers the advantage of finding a better prediction of a given block. Such a mode also typically offers the advantage of finding a corresponding stereo-image block without needing to know the disparity.
- The encoding block 110 includes an H.264 decoder 156 that decodes the encoded picture to produce a decoded picture 158. In the implementation of the encoding block 110, the encoded picture is decoded using HP@L4.x. The decoded picture 158 is a reconstruction of the composite picture 152.
- The encoding block 110 deinterleaves the decoded picture 158 to form a horizontal split screen picture 160. The horizontal split screen picture 160 includes a left picture reconstruction 162 of the sampled left view picture 148, and a right picture reconstruction 164 of the sampled right view picture 150. The horizontal split screen picture 160 is stored as a reference picture in a reference picture storage (not shown), and is available to be used as a reference picture by the encoding block 110.
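- The deinterleaving step is the inverse of the column-wise interleave sketched earlier: even block columns return to the left half and odd block columns to the right half, yielding a horizontal split screen picture. Again a minimal sketch under the same assumptions; the function name is not from the patent.

```python
import numpy as np

def deinterleave_columns(composite: np.ndarray, block: int = 16) -> np.ndarray:
    """Undo column-wise block interleaving into a left/right split screen.

    Even block columns form the left half of the output and odd block
    columns form the right half; overall dimensions are unchanged.
    """
    h, w = composite.shape[:2]
    assert w % (2 * block) == 0
    out = np.empty_like(composite)
    half = w // 2
    for c in range(w // (2 * block)):
        out[:, c * block:(c + 1) * block] = composite[:, (2 * c) * block:(2 * c + 1) * block]
        out[:, half + c * block:half + (c + 1) * block] = composite[:, (2 * c + 1) * block:(2 * c + 2) * block]
    return out
```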
- P and B pictures are coded as horizontal split screen pictures. That is, for P and B pictures, the sampled left view picture 148 and the sampled right view picture 150 are formed into a horizontal split screen picture rather than an interleaved composite picture, and encoded by the encoder 154. Reference pictures are also stored as horizontal split screen pictures, as indicated above. When P or B coded blocks contain motion references that point to the I picture, the motion-compensated prediction is extracted from the horizontal split screen reconstructed picture 160.
- The encoding block 110 thus performs different operations for I blocks, as compared to P and B blocks. For example, for I blocks the encoding block 110 performs (i) interleaving before encoding and (ii) deinterleaving before forming a horizontal split screen reconstruction picture. As another example, for P and B blocks, the encoding block 110 forms a split screen picture before encoding.
- The encoder 154 also provides the encoded picture (not shown) to the transmission operation 130 for transmission. The transmitted picture is received by the decoding block 120.
- The decoding block 120 includes an H.264 decoder 170 that performs an HP@L4.x decode of the received picture. The decoder 170 produces a reconstructed picture 172 that is a reconstruction of the composite picture 152. Accordingly, the reconstructed picture 172 has macroblocks interleaved from a left image (the sampled left view picture 148) and a right image (the sampled right view picture 150). In a typical implementation, the decoder 170 will be the same as the decoder 156.
- The decoding block 120 deinterleaves the reconstructed picture 172 to form a horizontal split screen picture 174 that includes a left picture reconstruction 176 and a right picture reconstruction 178. If there are no errors in transmission or decoding, (i) the reconstructed picture 172 will match the decoded picture 158 from the encoding block 110, (ii) the horizontal split screen picture 174 will match the horizontal split screen picture 160, (iii) the left picture reconstruction 176 will match the left picture reconstruction 162, and (iv) the right picture reconstruction 178 will match the right picture reconstruction 164.
- The decoding block 120 includes a sampler 180 that performs horizontal sample rate conversion to recover the original horizontal size. The sampler 180 performs the conversion by upsampling the left picture reconstruction 176 to recover the original horizontal size of the left view picture 140. The sampler 180 produces a reconstructed left view picture 184, which is a reconstruction of the left view picture 140. Upsampling is also referred to as rate converting or up-scaling.
- Similarly, the decoding block 120 includes a sampler 182 that performs horizontal sample rate conversion to recover the original horizontal size. The sampler 182 performs the conversion by upsampling the right picture reconstruction 178 to recover the original horizontal size of the right view picture 142. The sampler 182 produces a reconstructed right view picture 186, which is a reconstruction of the right view picture 142.
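- A matching sketch of the upsampling step doubles the width by linear interpolation between horizontally adjacent samples. As with the down-sampler above, the filter choice and the function name are assumptions, since the SRC filter is left unspecified.

```python
import numpy as np

def upsample_double_width(picture: np.ndarray) -> np.ndarray:
    """Horizontal 2x sample rate conversion by linear interpolation."""
    h, w = picture.shape[:2]
    out = np.empty((h, 2 * w) + picture.shape[2:], dtype=picture.dtype)
    src = picture.astype(np.float64)
    out[:, 0::2] = picture                       # keep the original samples
    mids = (src[:, :-1] + src[:, 1:]) / 2.0      # midpoints between neighbors
    out[:, 1:-1:2] = mids.astype(picture.dtype)
    out[:, -1] = picture[:, -1]                  # replicate the last column
    return out
```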
- The reconstructed left view picture 184 and the reconstructed right view picture 186 are full resolution pictures ready for output to display. Other implementations also, or alternatively, provide the reconstructed left view picture 184 and/or the reconstructed right view picture 186 for processing. Such processing includes, for example, filtering, rendering further images, artifact reduction, color modification, edge sharpening, and/or object detection, and may be performed prior to display or in lieu of display. Additionally, other implementations provide the horizontal split screen picture 174 as output for processing and/or display.
- As with the encoding block 110, the decoding block 120 also performs different operations for I blocks, as compared to P and B blocks. For example, for I blocks the decoding block 120 performs deinterleaving before forming the horizontal split screen picture 174. In contrast, for P and B blocks, the output of the decoder 170 will be a horizontal split screen picture.
- The process of FIG. 1 is at least largely backward compatible with existing processes. Additionally, legacy H.264 encoders and decoders may be used. However, the process of FIG. 1 may not be completely backward compatible with all existing decode processes. Nonetheless, it is within the capability of many decoders to use an integrated Blit capability (for example, a programmable bitmap graphics device, or a bit blit device performing bit-block image transfers, for example, to combine multiple bitmaps) or a DMA capability to convert the macroblock-interleaved I picture into a left/right split screen image. That is, an existing H.264 decoder might not be configured to convert the decoded picture 158 (an interleaved picture) into the horizontal split screen picture 160, or to convert the reconstructed picture 172 into the horizontal split screen picture 174. However, techniques for performing this conversion are viable and well within the ordinary skill in the art using, for example, technologies such as integrated Blit or DMA. Additionally, such technologies can be used to selectively create either an interleaved image (for example, the composite picture 152) or a split screen concatenated image to be used as input to an H.264 encoder.
- Other implementations modify various aspects of the system 100 described above. Certain implementations and modifications are described below, but other modifications are contemplated as well.
- Additionally, the input images need not be downsampled by exactly ½, and need not be downsampled at all. In various implementations, the input images (i) remain at their original sampling rate, (ii) are downsampled by values other than ½, or (iii) are upsampled.
- Further, the input images need not be sampled at the same rate. In various implementations, a first input image is sampled at a first rate and a second input image is sampled at a second rate that is different from the first rate.
- Implementations may use more than two input images. Various implementations use three or more input images, and interleave all of the input images. One such implementation interleaves three or more input views from a multi-view system. Another such implementation interleaves four images that include a first stereo image pair taken from a stereo camera at a first instant of time and a second stereo image pair taken from the stereo camera at a second instant of time.
- Various implementations process the input images in addition to, or in lieu of, sampling the input images. Processing performed by various implementations includes, for example, filtering the pixel values of the images, clipping the pixel values of the images, adding blocks to the images around the image borders, or removing blocks that do not have overlapping content.
- The blocks used for interleaving need not be 16×16, nor even macroblocks. Various implementations use blocks having a size different from 16×16 and/or use a block size different from the size of macroblocks used in encoding. Various implementations also vary the block size or use a selectable block size. The H.264 standard allows intra prediction for 4×4 blocks, 8×8 blocks, and 16×16 macroblocks. An above implementation illustrates and describes the concept using macroblocks, but other implementations implement the interleaving at the block level, including, for example, a 4×4 block level, an 8×8 block level, and a variable level that uses both 4×4 blocks and 8×8 blocks.
- The interleaved image need not be encoded using HP@L4.x, nor even H.264. Various implementations use different H.264 profiles or different coding schemes. For example, for H.264 all levels of the High Profile, all levels of the Main Profile, and all levels of the Baseline Profile may be used, and various implementations are directed to each of these levels and Profiles.
- The encoded interleaved image provided by the encoding block 110 need not be transmitted. Various implementations store the encoded image, for example.
- The reference images need not be horizontal split screen images, or even split screen images at all. Various implementations use, for example, vertical split screen images as references, or interleaved images as references, or the individual images as references.
- P and B pictures need not be coded as horizontal split screen pictures. Various implementations perform interleaving of P and/or B stereoscopic image pairs, as is done above for I pictures. One or more of these implementations codes the interleaved P and/or B pictures using inter-coding with respect to other pictures used as references. The references for several such implementations are also interleaved pictures, but for other implementations the references are not interleaved. Additionally, some of these implementations consider both inter-prediction modes and intra-prediction modes for coding a given block in the interleaved P or B picture. As such, some of these implementations perform an optimal encoding of the given block from the interleaved P or B picture.
- Referring to FIG. 4, there is shown a more detailed view of two implementations of macroblock level interleaving. FIG. 4 depicts a left picture 410 and a right picture 420 of a stereo-image pair. In this implementation, the left picture 410 and the right picture 420 are assumed to have been downsampled in the horizontal direction by a factor of 2. These two pictures 410 and 420 are combined, as shown by arrow 425, to form an interleaved picture 430. The interleaved picture 430 is effectively a column-wise interleaving of the pictures 410 and 420.
- For this implementation, it is assumed that an encoder encodes the interleaved picture 430 row-by-row, from left to right. Accordingly, as the encoder is encoding the interleaved picture 430, it can be seen that when the encoder gets to the block labeled R22 (circled in the interleaved picture 430), the encoder has already encoded the corresponding block L22 (also circled in the interleaved picture 430) from the left picture, and has the encoding of L22 available to use in encoding R22. L22 is to the immediate left of R22 in the interleaved picture 430.
- L22 and R22 correspond as corresponding blocks in a stereo-image pair, and so their content is assumed to overlap considerably. Content overlaps when both blocks have some common content. Blocks share common content when, for example, both blocks include a particular object or background, even if that object or background is not in exactly the same relative position in each of the blocks.
- Identification of these corresponding blocks is based simply on the fact that L22 and R22 have corresponding locations in the two pictures 410 and 420. That is, L22 and R22 are assumed to have the same (x,y) coordinates in their respective pictures 410 and 420.
- Other implementations determine corresponding blocks based on, for example, disparity. For such disparity-based implementations, a variety of disparity-based metrics may be used, such as, for example, the average disparity for the stereo-image pair. In one such implementation, the average disparity of the picture 410 is determined to be equal to the horizontal size of a single block. Accordingly, the block L12 of the picture 410 is determined to correspond to the block R11 of the picture 420. Note that in such an implementation, the interleaving may still be performed as in the interleaved picture 430, or the interleaving may be based on the disparity.
- In one disparity-based implementation, the blocks are interleaved as in the interleaved picture 430. However, corresponding blocks may or may not be neighbors. In the example in which L12 corresponds to R11, those blocks would still be neighbors, as shown in the interleaved picture 430. However, if the disparity were equal to twice the horizontal size of the blocks, then L13 would correspond to R11, and those blocks would not be neighbors in the interleaved picture 430.
- In another disparity-based implementation, the blocks are interleaved based on the disparity. Therefore, if L13 corresponds to R11, then those blocks are interleaved so that they are neighbors. In one such implementation, the first two columns of the picture 410 are inserted directly into the interleaved picture, then the remaining columns of the picture 410 are column-interleaved with columns from the picture 420. Finally, the last remaining columns of the picture 420 are inserted directly into the interleaved picture.
- In other disparity-based implementations, blocks from the various input images are interleaved based on their relative locations in the input images. For example, the first column of the
picture 410 is followed by the first column of thepicture 420. However, an individual block of the interleaved picture is intra-coded by searching within the interleaved picture to find a good reference. Such a search may identify the corresponding block without the implementation knowing the disparity prior to the search. - Other downsampling and interleaving options are possible. Referring still to
- Other downsampling and interleaving options are possible. Referring still to FIG. 4, an implementation is shown in which the left picture 410 and the right picture 420 are assumed to have been downsampled in the vertical direction by a factor of 2, rather than in the horizontal direction as previously described in the discussion of FIG. 4 above. Further, the vertically-downsampled pictures 410 and 420 are then interleaved row-wise to form an interleaved picture 440, as shown by an arrow 435. As with the encoding of the interleaved picture 430, it can be seen that when the encoder gets to the block labeled R22 (circled in the interleaved picture 440), the encoder has already encoded the corresponding block L22 (also circled in the interleaved picture 440) from the left picture, and has the encoding of L22 available to use in encoding R22. L22 is immediately above R22 in the interleaved picture 440.
- Another implementation does not downsample at all, and the
- Another implementation does not downsample at all, and the pictures 410 and 420 are assumed to be in their original sizes. This implementation simply combines the left and right pictures using any of various interleaving options known in the art to produce a large interleaved picture. The H.264 encoder then encodes this large interleaved picture.
- In a typical implementation of either interleaving option of FIG. 4, an encoder and a decoder de-interleave the interleaved pictures 430 and 440 to form a reconstruction of a typical left/right horizontal split screen view, such as that provided by the horizontal split screen picture 174 of FIG. 1. However, in other implementations, the encoder and decoder do not perform this operation. Rather, the encoder and decoder simply produce a reconstructed interleaved picture that still has the left and right views interleaved. The encoder uses this interleaved reconstruction to perform encoding of subsequent pictures. For example, if a P picture is to be motion encoded using an interleaved I picture as a reference, the encoder performs the search for an appropriate motion vector in the normal manner using the interleaved I picture. In this manner, the encoder may determine that a block corresponding to either the left or right view is the best “match” for the current block being encoded in the P picture. Other implementations expand the search window used in finding the best “match” in the reference picture to account for the fact that the interleaving has spread the blocks of the component left and right pictures further apart in the interleaved reference picture.
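- The disparity-guided variant described above can be sketched as a whole-picture shift before the column-wise interleave, so that blocks that correspond in content land next to each other. The shift-in-blocks parameter and the function name are assumptions for illustration, and `interleave_columns` is the sketch given earlier. Note that the patent instead splices the unpaired edge columns directly into the interleaved picture; the zero padding here is a simplification.

```python
import numpy as np

def interleave_with_disparity_shift(left: np.ndarray, right: np.ndarray,
                                    shift_blocks: int, block: int = 16) -> np.ndarray:
    """Column-wise interleave after shifting the right view right by
    `shift_blocks` block columns (an average-disparity estimate), so that
    left block column c sits next to right block column c - shift_blocks.
    """
    s = shift_blocks * block
    pad = np.zeros_like(right[:, :s])
    # Zero-pad on the left; the last s right-view columns fall off the edge.
    shifted_right = np.concatenate([pad, right[:, :right.shape[1] - s]], axis=1)
    return interleave_columns(left, shifted_right, block)
```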
- Referring to FIG. 5, there is shown an implementation for use in encoding two images. FIG. 5 depicts a process 500 for use in encoding two images, or portions thereof.
- The process 500 includes accessing a first-image block (510). The first image may be, for example, the sampled left view picture 148 of FIG. 1, and the first-image block may be, for example, the top left block from the sampled left view picture 148.
- The process 500 includes accessing a second-image block that overlaps the first-image block in content (520). The second image may be, for example, the sampled right view picture 150 of FIG. 1. Both the sampled left view picture 148 and the sampled right view picture 150 are generated from a stereo-image pair, and so are assumed to overlap in content. The second-image block may be, for example, the top left block from the sampled right view picture 150.
- If the disparity for the top left block of the sampled left view picture 148 is greater than the horizontal block size of the top left block, it is possible that the content does not overlap the content of the top left block of the sampled right view picture 150. As mentioned above, content may overlap, for example, when the two blocks include a common feature, even if the feature is not aligned in the same relative location in each of the two blocks. Such overlap typically occurs in stereo-image pairs, as well as in the separate views of a multi-view system. Content may also overlap regardless of whether one of the images is flipped, rotated, filtered, or otherwise processed.
- The process 500 includes block interleaving a portion from the first image that includes the first-image block, and a portion from the second image that includes the second-image block (530). The two portions may include the entire first and second images. Alternatively, the two portions may include less than all of the first and second images. The block interleaving may be, for example, as described above for forming the composite picture 152 of FIG. 1.
- The process 500 includes encoding the interleaved first-image block using the interleaved second-image block as a reference (540). The encoding may be performed, for example, as described above for using the encoder 154 of FIG. 1 to encode blocks from the composite picture 152. For example, assuming column-wise interleaving, encoding block G of FIG. 3 using block A as a predictor (that is, as a reference) will result in a first-image block (block G) being encoded using a second-image block (block A) as a reference.
- Referring to FIG. 6, there is shown an implementation for use in decoding two images. FIG. 6 depicts a process 600 for use in decoding two images, or portions thereof.
- The process 600 includes accessing an encoding of an image (610). The image is an interleaved image in which two images have been interleaved on a block basis. The two images are a first image that includes multiple first-image blocks, and a second image that includes multiple second-image blocks. The encoding may be, for example, the received picture that is received and decoded by the decoding block 120 of FIG. 1 discussed above.
- The process 600 includes decoding a portion of the accessed encoding (620). The portion includes an encoding of a first-image block that has been encoded using a second-image block as a reference. The first-image block may be, as suggested above, the top left block from the sampled left view picture 148. The second-image block may be, as suggested above, the top left block from the sampled right view picture 150, which is assumed in this discussion to have overlapping content with the top left block from the sampled left view picture 148. The decoding may be performed by, for example, the H.264 decoder 170 of FIG. 1 discussed above.
- Referring to FIG. 7, an encoder 700 depicts an implementation of an encoder that may be used to encode images such as, for example, video images or depth images. In one implementation, the encoder 700 is used as the encoder 154 in the system 100 of FIG. 1. The encoder 700 may also be used to encode data, such as, for example, metadata providing information about the encoded bitstream. The encoder 700 may be implemented as part of, for example, a video transmission system as described below with respect to FIG. 9. It should also be clear that the blocks of FIG. 7 provide a flow diagram of an encoding process, in addition to providing a block diagram of an encoder.
- An input image sequence arrives at an adder 701 as well as at a displacement compensation block 720 and a displacement estimation block 718. Note that displacement refers, for example, to either motion or disparity. Another input to the adder 701 is one of a variety of possible reference picture information received through a switch 723.
- For example, if a mode decision module 724 in signal communication with the switch 723 determines that the encoding mode should be intra-prediction with reference to a block from the same picture currently being encoded, then the adder 701 receives its input from an intra-prediction module 722. Alternatively, if the mode decision module 724 determines that the encoding mode should be displacement compensation and estimation with reference to a picture that is different from the picture currently being encoded, then the adder 701 receives its input from the displacement compensation module 720.
- In various implementations, the intra-prediction module 722 provides a predetermined predictor based on one or more blocks that are neighboring blocks to a block being encoded. Such neighboring blocks may be interleaved blocks from another input image, such as, for example, a picture that forms a stereo-image pair with the picture being encoded. In various implementations, the interleaving is based on (x,y) coordinates, such that the blocks are interleaved in the order in which they appear in the constituent pictures. However, in other implementations the interleaving is based on disparity, such that blocks that correspond in content are interleaved adjacent to each other to the extent possible, regardless of where those blocks are located in their constituent pictures.
- In various implementations, the
- In various implementations, the intra-prediction module 722 provides a predictor (a reference) by searching within the picture being encoded for the best reference block. More specifically, several such implementations search within a reconstruction of those portions of the current picture that have already been encoded. In some implementations, the search is restricted to blocks that lie on the existing block boundaries. However, in other implementations, the search is allowed to consider candidate blocks regardless of whether those blocks cross existing block boundaries. Because of the searching, such implementations are often more time-intensive and processor-intensive than merely using predetermined neighboring blocks as the references. However, such implementations typically offer the advantage of finding a better prediction of a given block. Such implementations also typically offer the advantage of finding a corresponding stereo-image block, or corresponding multi-view-image block, without needing to know the disparity.
- The
- The adder 701 provides a signal to a transform module 702, which is configured to transform its input signal and provide the transformed signal to a quantization module 704. The quantization module 704 is configured to perform quantization on its received signal and output the quantized information to an entropy encoder 705. The entropy encoder 705 is configured to perform entropy encoding on its input signal to generate a bitstream. An inverse quantization module 706 is configured to receive the quantized signal from the quantization module 704 and perform inverse quantization on the quantized signal. In turn, an inverse transform module 708 is configured to receive the inverse quantized signal from the inverse quantization module 706 and perform an inverse transform on its received signal. The output of the inverse transform module 708 is a reconstruction of the signal that is output from the adder 701.
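- The transform and quantization path lends itself to a small numerical sketch. The following is a generic transform-coding round trip (a floating-point DCT with uniform quantization), not the H.264 integer transform and scaling; the names and the quantization step are assumptions.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block: np.ndarray) -> np.ndarray:
    return dct(dct(block, norm='ortho', axis=0), norm='ortho', axis=1)

def idct2(coeffs: np.ndarray) -> np.ndarray:
    return idct(idct(coeffs, norm='ortho', axis=0), norm='ortho', axis=1)

def transform_quantize_roundtrip(block: np.ndarray, qstep: float = 8.0):
    """Forward transform plus quantization, and the local reconstruction
    that the encoder's inverse path (modules 706 and 708) feeds back."""
    coeffs = dct2(block.astype(np.float64))
    levels = np.round(coeffs / qstep)   # quantized levels go to the entropy coder
    recon = idct2(levels * qstep)       # inverse quantize + inverse transform
    return levels, recon
```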
inverse transform module 708 and theswitch 723 and outputs the resulting signal to theintra prediction module 722 and an in-loop filter 710. The resulting signal is a reconstruction of the image sequence signal that is input to theencoder 700. - The
intra prediction module 722 performs intra-prediction, as discussed above, using its received signals. Similarly, the in-loop filter 710 filters the signals received from theadder 709 and provides filtered signals to areference buffer 712. Thereference buffer 712 provides image information to the displacement estimation and 718 and 720.compensation modules - Metadata may be added to the
encoder 700 as encoded metadata and combined with the output bitstream from theentropy coder 705. Alternatively, for example, unencoded metadata may be input to theentropy coder 705 for entropy encoding along with the quantized image sequences. - Data is also provided to the output bitstream by the
- Data is also provided to the output bitstream by the mode decision module 724. The mode decision module 724 provides information to the bitstream that indicates the mode used to encode a given block. Such information often includes an indication of the location of the reference block. For example, in various implementations that use intra-prediction and that perform a search of the current picture to find a reference block, the mode decision module 724 indicates the location of the reference using a disparity vector. The disparity vector information may be provided to the mode decision module 724 by the intra prediction module 722.
- Referring to
- Referring to FIG. 8, a decoder 800 depicts an implementation of a decoder that may be used to decode images and provide them to, for example, a display device. The decoder 800 may also be used to decode, for example, metadata providing information about the decoded bitstream. In one implementation, the decoder 800 is used as the decoder 156 and/or the decoder 170 in the system 100 of FIG. 1. Further, the decoder 800 may be implemented as part of, for example, a video receiving system as described below with respect to FIG. 10. It should also be clear that the blocks of FIG. 8 provide a flow diagram of a decoding process, in addition to providing a block diagram of a decoder.
- The decoder 800 is configured to receive a bitstream using a bitstream receiver 802. The bitstream receiver 802 is in signal communication with a bitstream parser 804 and provides the bitstream to the bitstream parser 804.
- The bitstream parser 804 is configured to transmit a residue bitstream to an entropy decoder 806, to transmit control syntax elements to a mode selection module 816, and to transmit displacement (motion/disparity) vector information to a displacement compensation module 826 and to an intra prediction module 818.
- An
- An inverse quantization module 808 performs inverse quantization on an entropy decoded signal received from the entropy decoder 806. In addition, an inverse transform module 810 is configured to perform an inverse transform on an inverse quantized signal received from the inverse quantization module 808 and to output the inverse transformed signal to an adder (also referred to as a combiner) 812.
- The adder 812 can receive one of a variety of other signals depending on the decoding mode employed. For example, the mode selection module 816 can determine whether displacement compensation or intra prediction encoding was performed on the currently processed block by the encoder by parsing and analyzing the control syntax elements. Depending on the determined mode, the mode selection module 816 can access and control a switch 817, based on the control syntax elements, so that the adder 812 can receive signals from the displacement compensation module 826 or the intra prediction module 818.
- Here, the intra prediction module 818 is configured to perform intra prediction to decode a block using references to the same picture currently being decoded. In turn, the displacement compensation module 826 is configured to perform displacement compensation to decode a block using references to a block of another previously processed picture that is different from the picture currently being decoded.
- Additionally, the intra prediction module 818 of various implementations receives disparity vector information from the bitstream parser 804 identifying the location of the reference block used in intra-prediction. In such implementations, the block has typically been encoded in an intra-coding mode that searches the picture being coded to find a reference. This is in contrast, for example, to using one or more predetermined blocks from the picture being encoded to generate a predictor.
- After receiving prediction or compensation information signals, the adder 812 adds the prediction or compensation information signals to the inverse transformed signal for transmission to an in-loop filter 814, such as, for example, a deblocking filter that filters out blocking artifacts. The adder 812 also outputs the added signal to the intra prediction module 818 for use in intra prediction.
- The in-loop filter 814 is configured to filter its input signal and output decoded pictures. Further, the in-loop filter 814 provides the filtered signal to a reference buffer 820. The reference buffer 820 is configured to parse its received signal to permit and aid in displacement compensation decoding by the displacement compensation module 826, to which the reference buffer 820 provides parsed signals. Such parsed signals may be, for example, all or part of various pictures that may have been used as a reference.
bitstream receiver 802. The metadata may be parsed by thebitstream parser 804, and decoded by theentropy decoder 806. The decoded metadata may be extracted from thedecoder 800 after the entropy decoding using an output (not shown). - Referring now to
- Referring now to FIG. 9, a video transmission system/apparatus 900 is shown, to which the features and principles described above may be applied. The video transmission system 900 may be, for example, a head-end or transmission system for transmitting a signal using any of a variety of media, such as, for example, satellite, cable, telephone-line, or terrestrial broadcast. The transmission may be provided over the Internet or some other network. The video transmission system 900 is capable of generating and delivering, for example, video content and other content such as, for example, indicators of depth including, for example, depth and/or disparity values. It should also be clear that the blocks of FIG. 9 provide a flow diagram of a video transmission process, in addition to providing a block diagram of a video transmission system/apparatus.
- The video transmission system 900 receives input video from a processing device 901. In one implementation, the processing device 901 simply provides original-sized images, such as the left view picture 140 and the right view picture 142, to the video transmission system 900. However, in another implementation, the processing device 901 is a processor configured for performing down-sampling and interleaving as described above for the system 100 with respect to the operations of the sampler 144 and the sampler 146, as well as the interleaving that results in the composite picture 152. Various implementations of the processing device 901 include, for example, processing devices implementing the operations 510, 520, and 530 of the process 500 of FIG. 5. The processing device 901 may also provide metadata to the video transmission system 900 indicating whether the input picture is interleaved and/or providing various parameters describing the interleaving. Such parameters include, for example, the number of interleaved pictures, the conversion rate for each picture, the conversion type (for example, horizontal sampling or vertical sampling) for each picture, or the interleaving mode (for example, row-wise interleaving or column-wise interleaving).
- The video transmission system 900 includes an encoder 902 and a transmitter 904 capable of transmitting the encoded signal. The encoder 902 receives video information from the processor 901. The video information may include, for example, images and depth indicators. The encoder 902 generates an encoded signal(s) based on the video information. The encoder 902 may be, for example, the encoding block 110, the encoder 154, or the encoder 700. The encoder 902 may include sub-modules, including for example an assembly unit for receiving and assembling various pieces of information into a structured format for storage or transmission. The various pieces of information may include, for example, coded or uncoded video, coded or uncoded depth indicators and/or information, and coded or uncoded elements such as, for example, motion vectors, coding mode indicators, and syntax elements. In some implementations, the encoder 902 includes the processor 901 and therefore performs the operations of the processor 901.
- The transmitter 904 receives the encoded signal(s) from the encoder 902 and transmits the encoded signal(s) in one or more output bitstreams. The transmitter 904 may be, for example, adapted to transmit a program signal having one or more bitstreams representing encoded pictures and/or information related thereto. Typical transmitters perform functions such as, for example, one or more of providing error-correction coding, interleaving the data in the signal, randomizing the energy in the signal, and modulating the signal onto one or more carriers using a modulator 906. The transmitter 904 may include, or interface with, an antenna (not shown). Further, implementations of the transmitter 904 may be limited to the modulator 906.
- Referring now to FIG. 10, a video receiving system/apparatus 1000 is shown to which the features and principles described above may be applied. The video receiving system 1000 may be configured to receive signals over a variety of media, such as, for example, satellite, cable, telephone-line, or terrestrial broadcast. The signals may be received over the Internet or some other network. It should also be clear that the blocks of FIG. 10 provide a flow diagram of a video receiving process, in addition to providing a block diagram of a video receiving system/apparatus.
- The video receiving system 1000 may be, for example, a cell-phone, a computer, a set-top box, a television, or other device that receives encoded video and provides, for example, decoded video for display to a user, for processing, or for storage. Thus, the video receiving system 1000 may provide its output to, for example, a screen of a television, a computer monitor, a computer (for storage, processing, or display), or some other storage, processing, or display device.
- The video receiving system 1000 is capable of receiving and processing video content including video information. The video receiving system 1000 includes a receiver 1002 for receiving an encoded signal, such as, for example, the signals described in the implementations of this application. The receiver 1002 may receive, for example, a signal providing the received picture to the decoding block 120 of FIG. 1, a signal carrying the bitstream from the encoder 700 of FIG. 7, or a signal output from the video transmission system 900 of FIG. 9.
- The receiver 1002 may be, for example, adapted to receive a program signal having a plurality of bitstreams representing encoded pictures. Typical receivers perform functions such as, for example, one or more of receiving a modulated and encoded data signal, demodulating the data signal from one or more carriers using a demodulator 1004, de-randomizing the energy in the signal, de-interleaving the data in the signal, and error-correction decoding the signal. The receiver 1002 may include, or interface with, an antenna (not shown). Implementations of the receiver 1002 may be limited to the demodulator 1004.
- The video receiving system 1000 includes a decoder 1006. The receiver 1002 provides a received signal to the decoder 1006. The decoder 1006 outputs a decoded signal, such as, for example, decoded video signals including video information. The decoder 1006 may be, for example, the decoder 156 or the decoder 170 of the system 100 of FIG. 1, or the decoder 800 of FIG. 8.
- The output video from the decoder 1006 is provided, in one implementation, to a processing device 1008. The processing device 1008 is, in one implementation, a processor configured for performing deinterleaving and up-sampling as described above for the system 100 with respect to the deinterleaving that results in the horizontal split screen picture 174, as well as the operations of the sampler 180 and the sampler 182. In some implementations, the decoder 1006 includes the processor 1008 and therefore performs the operations of the processor 1008. In other implementations, the processor 1008 is part of a downstream device such as, for example, a set-top box or a television.
- We thus provide one or more implementations having particular features and aspects. However, features and aspects of described implementations may also be adapted for other implementations.
- For example, the above features, aspects, and implementations may be applied or adapted to other systems that are not restricted to left/right stereo systems. One such implementation interleaves a video picture and its corresponding depth picture. Another such implementation interleaves two or more different views from a multi-view system that are not necessarily related as left and right views.
- As another example, the above implementations generally describe interleaving at a macroblock level. However, interleaving is performed at other levels in other implementations. Such other levels include, for example, a field level, a slice level, and a partition level.
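To make the macroblock-level case concrete, the following hypothetical sketch interleaves two equally sized pictures as a checkerboard of 16×16 blocks. The checkerboard pattern, the 16×16 block size, and the use of numpy are illustrative choices only, not features required by the implementations described above.

```python
import numpy as np

MB = 16  # macroblock size assumed for this illustration

def interleave_macroblocks(pic_a, pic_b):
    # Combine two equally sized pictures by taking 16x16 blocks
    # alternately from each, in a checkerboard pattern.
    out = pic_a.copy()
    for by in range(0, pic_a.shape[0], MB):
        for bx in range(0, pic_a.shape[1], MB):
            # Use pic_b's block wherever the block coordinates sum to odd.
            if ((by // MB) + (bx // MB)) % 2 == 1:
                out[by:by + MB, bx:bx + MB] = pic_b[by:by + MB, bx:bx + MB]
    return out

a = np.zeros((64, 64), dtype=np.uint8)
b = np.full((64, 64), 255, dtype=np.uint8)
# Sampling one pixel per block shows the alternation: [[0 255] [255 0]]
print(interleave_macroblocks(a, b)[0:32:16, 0:32:16])
```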
- As yet another example, these implementations and features may be used in the context of coding video and/or coding other types of data. Additionally, these implementations and features may be used in the context of, or adapted for use in the context of, a standard. Such standards include, for example, AVC, the extension of AVC for multi-view coding (MVC), the extension of AVC for scalable video coding (SVC), and any proposed MPEG/JVT standards for 3-D Video coding (3DV) and for High-Performance Video Coding (HVC), but other standards (existing or future) may be used. Of course, the implementations and features need not be used in a standard.
- Various implementations refer to “images” and/or “pictures”. The terms “image” and “picture” are used interchangeably throughout this document, and are intended to be broad terms. An “image” or a “picture” may be, for example, all or part of a frame or of a field. The term “video” refers to a sequence of images (or pictures). An image, or a picture, may include, for example, any of various video components or their combinations. Such components, or their combinations, include, for example, luminance, chrominance, Y (of YUV or YCbCr or YPbPr), U (of YUV), V (of YUV), Cb (of YCbCr), Cr (of YCbCr), Pb (of YPbPr), Pr (of YPbPr), red (of RGB), green (of RGB), blue (of RGB), S-Video, and negatives or positives of any of these components. An “image” or a “picture” may also, or alternatively, refer to various different types of content, including, for example, typical two-dimensional video, a disparity map for a 2D video picture, a depth map that corresponds to a 2D video picture, or an edge map.
- Additionally, this application or its claims may refer to “determining” various pieces of information. Determining the information may include one or more of, for example, estimating the information, calculating the information, predicting the information, identifying the information, or retrieving the information from memory.
- It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) without the second listed option (B), or the selection of the second listed option (B) without the first listed option (A), or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C” and “at least one of A, B, or C”, such phrasing is intended to encompass the selection of the first listed option (A) without the second (B) and third (C) listed options, or the selection of the second listed option (B) without the selection of the first (A) and third (C) listed options, or the selection of the third listed option (C) without the selection of the first (A) and second (B) listed options, or the selection of the first and the second listed options (A and B) without the selection of the third listed option (C), or the selection of the first and third listed options (A and C) without the selection of the second listed option (B), or the selection of the second and third listed options (B and C) without the selection of the first listed option (A), or the selection of all three options (A and B and C). This may be extended, as will be readily apparent to one of ordinary skill in this and related arts, for lists of any size. Note that none of the phrasing discussed in this paragraph is intended to limit the selection so as not to include elements that are not listed. For example, “A and/or B” does not preclude the selection of “A” and “C”.
- Reference to “one embodiment” or “an embodiment” or “one implementation” or “an implementation” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment. Additionally, these phrases (for example, “in one embodiment”) are not intended to indicate that there is only one possible embodiment but rather to draw attention to the fact that a particular embodiment is being discussed.
- The implementations described herein may be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the features discussed may also be implemented in other forms (for example, an apparatus or program). An apparatus may be implemented in, for example, appropriate hardware, software, and/or firmware. The methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants (“PDAs”), and other devices that facilitate communication of information between end-users.
- Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications, particularly, for example, equipment or applications associated with data encoding and decoding. Examples of such equipment include an encoder, a decoder, a post-processor processing output from a decoder, a pre-processor providing input to an encoder, a video coder, a video decoder, a video codec, a web server, a set-top box, a laptop, a personal computer, a cell phone, a PDA, and other communication devices. As should be clear, the equipment may be mobile and even installed in a mobile vehicle.
- Additionally, the methods may be implemented by instructions being performed by a processor, and such instructions (and/or data values produced by an implementation) may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact diskette, a random access memory (“RAM”), or a read-only memory (“ROM”). The instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two. A processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a processor-readable medium (such as a storage device) having instructions for carrying out a process. Further, a processor-readable medium may store, in addition to or in lieu of instructions, data values produced by an implementation.
- As will be evident to one of skill in the art, implementations may produce a variety of signals formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. Such a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries may be, for example, analog or digital information. The signal may be transmitted over a variety of different wired or wireless links, as is known. The signal may be stored on a processor-readable medium.
- A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. For example, elements of different implementations may be combined, supplemented, modified, or removed to produce other implementations. Additionally, one of ordinary skill will understand that other structures and processes may be substituted for those disclosed and the resulting implementations will perform at least substantially the same function(s), in at least substantially the same way(s), to achieve at least substantially the same result(s) as the implementations disclosed. Accordingly, these and other implementations are contemplated by this disclosure and are within the scope of this disclosure.
Claims (48)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/575,803 US9215445B2 (en) | 2010-01-29 | 2011-01-28 | Block-based interleaving |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US33706010P | 2010-01-29 | 2010-01-29 | |
| PCT/US2011/000168 WO2011094019A1 (en) | 2010-01-29 | 2011-01-28 | Block-based interleaving |
| US13/575,803 US9215445B2 (en) | 2010-01-29 | 2011-01-28 | Block-based interleaving |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20120300843A1 true US20120300843A1 (en) | 2012-11-29 |
| US9215445B2 US9215445B2 (en) | 2015-12-15 |
Family
ID=43836663
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US13/575,803 Expired - Fee Related US9215445B2 (en) | 2010-01-29 | 2011-01-28 | Block-based interleaving |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US9215445B2 (en) |
| EP (1) | EP2529557A1 (en) |
| JP (1) | JP5722349B2 (en) |
| KR (1) | KR101828096B1 (en) |
| CN (1) | CN102742282B (en) |
| BR (1) | BR112012018976A2 (en) |
| WO (1) | WO2011094019A1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10045014B2 (en) * | 2013-07-15 | 2018-08-07 | Mediatek Singapore Pte. Ltd. | Method of disparity derived depth coding in 3D video coding |
Family Cites Families (67)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5193000A (en) | 1991-08-28 | 1993-03-09 | Stereographics Corporation | Multiplexing technique for stereoscopic video system |
| US5689641A (en) | 1993-10-01 | 1997-11-18 | Vicor, Inc. | Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal |
| US5748786A (en) | 1994-09-21 | 1998-05-05 | Ricoh Company, Ltd. | Apparatus for compression using reversible embedded wavelets |
| US6055012A (en) | 1995-12-29 | 2000-04-25 | Lucent Technologies Inc. | Digital multi-view video compression with complexity and compatibility constraints |
| DE19619598A1 (en) | 1996-05-15 | 1997-11-20 | Deutsche Telekom Ag | Methods for storing or transmitting stereoscopic video signals |
| US6075905A (en) | 1996-07-17 | 2000-06-13 | Sarnoff Corporation | Method and apparatus for mosaic image construction |
| US6173087B1 (en) | 1996-11-13 | 2001-01-09 | Sarnoff Corporation | Multi-view image registration with application to mosaicing and lens distortion correction |
| US6157396A (en) | 1999-02-16 | 2000-12-05 | Pixonics Llc | System and method for using bitstream information to process images for use in digital display systems |
| US6390980B1 (en) | 1998-12-07 | 2002-05-21 | Atl Ultrasound, Inc. | Spatial compounding with ultrasonic doppler signal information |
| US6223183B1 (en) | 1999-01-29 | 2001-04-24 | International Business Machines Corporation | System and method for describing views in space, time, frequency, and resolution |
| US7254265B2 (en) | 2000-04-01 | 2007-08-07 | Newsight Corporation | Methods and systems for 2D/3D image conversion and optimization |
| DE10016074B4 (en) | 2000-04-01 | 2004-09-30 | Tdv Technologies Corp. | Method and device for generating 3D images |
| CN1210645C (en) | 2000-09-18 | 2005-07-13 | 国际商业机器公司 | Method of managing views on a computer display |
| KR20040071145A (en) | 2001-11-24 | 2004-08-11 | 티디브이 테크놀러지스 코포레이션 | Generation of a stereo image sequence from a 2d image sequence |
| US7263240B2 (en) | 2002-01-14 | 2007-08-28 | Eastman Kodak Company | Method, system, and software for improving signal quality using pyramidal decomposition |
| CA2380105A1 (en) | 2002-04-09 | 2003-10-09 | Nicholas Routhier | Process and system for encoding and playback of stereoscopic video sequences |
| AU2003235940A1 (en) | 2002-04-25 | 2003-11-10 | Sharp Kabushiki Kaisha | Image encodder, image decoder, record medium, and image recorder |
| JP4154569B2 (en) | 2002-07-10 | 2008-09-24 | 日本電気株式会社 | Image compression / decompression device |
| US7489342B2 (en) | 2004-12-17 | 2009-02-10 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for managing reference pictures in multiview videos |
| US20040260827A1 (en) | 2003-06-19 | 2004-12-23 | Nokia Corporation | Stream switching based on gradual decoder refresh |
| US7496234B2 (en) | 2003-06-20 | 2009-02-24 | Microsoft Corporation | System and method for seamless multiplexing of embedded bitstreams |
| KR100519776B1 (en) | 2003-11-24 | 2005-10-07 | 삼성전자주식회사 | Method and apparatus for converting resolution of video signal |
| KR100587952B1 (en) | 2003-12-05 | 2006-06-08 | 한국전자통신연구원 | Image encoding / decoding device and method thereof for performing compensation by reduction of left and right images to asymmetric size |
| GB2412519B (en) | 2004-03-23 | 2010-11-03 | British Broadcasting Corp | Monitoring system |
| JP4542447B2 (en) | 2005-02-18 | 2010-09-15 | 株式会社日立製作所 | Image encoding / decoding device, encoding / decoding program, and encoding / decoding method |
| KR100679740B1 (en) | 2004-06-25 | 2007-02-07 | 학교법인연세대학교 | Multi-view video encoding / decoding method with view selection |
| CN101977329B (en) | 2004-07-29 | 2012-10-03 | 微软公司 | Image processing using linear light values and other image processing improvements |
| EP1800493A4 (en) | 2004-10-16 | 2012-10-10 | Korea Electronics Telecomm | METHOD AND SYSTEM FOR MULTILOOK VIDEO CODING / DECODING BASED ON LAYERED DEPTH IMAGE |
| EP1667448A1 (en) | 2004-12-02 | 2006-06-07 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for encoding and for decoding a main video signal and one or more auxiliary video signals |
| US7728878B2 (en) | 2004-12-17 | 2010-06-01 | Mitsubishi Electric Research Labortories, Inc. | Method and system for processing multiview videos for view synthesis using side information |
| US20060176318A1 (en) | 2005-02-09 | 2006-08-10 | Martin Virginia L | Method for manipulating artwork to form decorative pieces |
| US8228994B2 (en) | 2005-05-20 | 2012-07-24 | Microsoft Corporation | Multi-view video coding based on temporal and view decomposition |
| KR100716999B1 (en) | 2005-06-03 | 2007-05-10 | 삼성전자주식회사 | Intra prediction method using image symmetry, image decoding, encoding method and apparatus using same |
| EP1897380B1 (en) | 2005-06-23 | 2015-11-18 | Koninklijke Philips N.V. | Combined exchange of image and related depth data |
| US7668366B2 (en) | 2005-08-09 | 2010-02-23 | Seiko Epson Corporation | Mosaic image data processing |
| US20100158133A1 (en) | 2005-10-12 | 2010-06-24 | Peng Yin | Method and Apparatus for Using High-Level Syntax in Scalable Video Encoding and Decoding |
| MY159176A (en) * | 2005-10-19 | 2016-12-30 | Thomson Licensing | Multi-view video coding using scalable video coding |
| US7903737B2 (en) | 2005-11-30 | 2011-03-08 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for randomly accessing multiview videos with known prediction dependency |
| US8457219B2 (en) | 2005-12-30 | 2013-06-04 | Ikanos Communications, Inc. | Self-protection against non-stationary disturbances |
| ZA200805337B (en) | 2006-01-09 | 2009-11-25 | Thomson Licensing | Method and apparatus for providing reduced resolution update mode for multiview video coding |
| JP5192393B2 (en) | 2006-01-12 | 2013-05-08 | エルジー エレクトロニクス インコーポレイティド | Multi-view video processing |
| US20070205367A1 (en) | 2006-03-01 | 2007-09-06 | General Electric Company | Apparatus and method for hybrid computed tomography imaging |
| KR101245251B1 (en) | 2006-03-09 | 2013-03-19 | 삼성전자주식회사 | Method and apparatus for encoding and decoding multi-view video to provide uniform video quality |
| JP2008034892A (en) * | 2006-03-28 | 2008-02-14 | Victor Co Of Japan Ltd | Multi-viewpoint image encoder |
| RU2529881C2 (en) | 2006-03-29 | 2014-10-10 | Томсон Лайсенсинг | Methods and device for use in multi-view video coding system |
| US7609906B2 (en) | 2006-04-04 | 2009-10-27 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for acquiring and displaying 3D light fields |
| US8139142B2 (en) | 2006-06-01 | 2012-03-20 | Microsoft Corporation | Video manipulation of red, green, blue, distance (RGB-Z) data including segmentation, up-sampling, and background substitution techniques |
| WO2008024345A1 (en) | 2006-08-24 | 2008-02-28 | Thomson Licensing | Adaptive region-based flipping video coding |
| PL2103136T3 (en) | 2006-12-21 | 2018-02-28 | Thomson Licensing | Methods and apparatus for improved signaling using high level syntax for multi-view video coding and decoding |
| US8515194B2 (en) | 2007-02-21 | 2013-08-20 | Microsoft Corporation | Signaling and uses of windowing information for images |
| BR122018004903B1 (en) | 2007-04-12 | 2019-10-29 | Dolby Int Ab | video coding and decoding tiling |
| WO2008140190A1 (en) | 2007-05-14 | 2008-11-20 | Samsung Electronics Co, . Ltd. | Method and apparatus for encoding and decoding multi-view image |
| JP4418827B2 (en) | 2007-05-16 | 2010-02-24 | 三菱電機株式会社 | Image display apparatus and method, and image generation apparatus and method |
| KR100962696B1 (en) | 2007-06-07 | 2010-06-11 | 주식회사 이시티 | Construction method of encoded stereoscopic video data file |
| US8373744B2 (en) | 2007-06-07 | 2013-02-12 | Reald Inc. | Stereoplexing for video and film applications |
| JP2010530702A (en) | 2007-06-19 | 2010-09-09 | 韓國電子通信研究院 | Metadata structure for storing and reproducing stereoscopic data, and method for storing stereoscopic content file using the same |
| JP5068598B2 (en) | 2007-08-01 | 2012-11-07 | 富士通コンポーネント株式会社 | Printer device |
| MY162861A (en) * | 2007-09-24 | 2017-07-31 | Koninl Philips Electronics Nv | Method and system for encoding a video data signal, encoded video data signal, method and system for decoding a video data signal |
| US8218855B2 (en) | 2007-10-04 | 2012-07-10 | Samsung Electronics Co., Ltd. | Method and apparatus for receiving multiview camera parameters for stereoscopic image, and method and apparatus for transmitting multiview camera parameters for stereoscopic image |
| KR100918862B1 (en) | 2007-10-19 | 2009-09-28 | 광주과학기술원 | Method and device for generating depth image using reference image, and method for encoding or decoding the said depth image, and encoder or decoder for the same, and the recording media storing the image generating the said method |
| EP2235957A1 (en) | 2007-12-20 | 2010-10-06 | Koninklijke Philips Electronics N.V. | Image encoding method for stereoscopic rendering |
| KR101506217B1 (en) | 2008-01-31 | 2015-03-26 | 삼성전자주식회사 | A method and apparatus for generating a stereoscopic image data stream for reproducing a partial data section of a stereoscopic image, and a method and an apparatus for reproducing a partial data section of a stereoscopic image |
| US20090219985A1 (en) * | 2008-02-28 | 2009-09-03 | Vasanth Swaminathan | Systems and Methods for Processing Multiple Projections of Video Data in a Single Video File |
| US8878836B2 (en) | 2008-02-29 | 2014-11-04 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding datastream including additional information on multiview image and method and apparatus for decoding datastream by using the same |
| KR101506219B1 (en) | 2008-03-25 | 2015-03-27 | 삼성전자주식회사 | Method and apparatus for providing and reproducing 3 dimensional video content, and computer readable medium thereof |
| US8106924B2 (en) | 2008-07-31 | 2012-01-31 | Stmicroelectronics S.R.L. | Method and system for video rendering, computer program product therefor |
| EP2197217A1 (en) | 2008-12-15 | 2010-06-16 | Koninklijke Philips Electronics N.V. | Image based 3D video format |
- 2011-01-28 KR KR1020127022539A patent/KR101828096B1/en not_active Expired - Fee Related
- 2011-01-28 US US13/575,803 patent/US9215445B2/en not_active Expired - Fee Related
- 2011-01-28 CN CN201180007731.3A patent/CN102742282B/en not_active Expired - Fee Related
- 2011-01-28 EP EP11705708A patent/EP2529557A1/en not_active Withdrawn
- 2011-01-28 WO PCT/US2011/000168 patent/WO2011094019A1/en not_active Ceased
- 2011-01-28 BR BR112012018976A patent/BR112012018976A2/en not_active Application Discontinuation
- 2011-01-28 JP JP2012551176A patent/JP5722349B2/en active Active
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070014360A1 (en) * | 2005-07-13 | 2007-01-18 | Polycom, Inc. | Video error concealment method |
| US20090002481A1 (en) * | 2007-06-26 | 2009-01-01 | Samsung Electronics Co., Ltd. | Method and apparatus for generating stereoscopic image bitstream using block interleaved method |
| US8471893B2 (en) * | 2007-06-26 | 2013-06-25 | Samsung Electronics Co., Ltd. | Method and apparatus for generating stereoscopic image bitstream using block interleaved method |
| US8102920B2 (en) * | 2007-07-04 | 2012-01-24 | Lg Electronics Inc. | Digital broadcasting system and data processing method |
| US8885721B2 (en) * | 2008-07-20 | 2014-11-11 | Dolby Laboratories Licensing Corporation | Encoder optimization of stereoscopic video delivery systems |
| US20110170792A1 (en) * | 2008-09-23 | 2011-07-14 | Dolby Laboratories Licensing Corporation | Encoding and Decoding Architecture of Checkerboard Multiplexed Image Data |
| US9025670B2 (en) * | 2009-01-29 | 2015-05-05 | Dolby Laboratories Licensing Corporation | Methods and devices for sub-sampling and interleaving multiple images, EG stereoscopic |
Cited By (39)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10362334B2 (en) | 2009-01-29 | 2019-07-23 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US20250220229A1 (en) * | 2009-01-29 | 2025-07-03 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US11973980B2 (en) | 2009-01-29 | 2024-04-30 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US20150326882A1 (en) * | 2009-01-29 | 2015-11-12 | Dolby Laboratories Licensing Corporation | Coding and Decoding of Interleaved Image Data |
| US20240080479A1 (en) * | 2009-01-29 | 2024-03-07 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US9420311B2 (en) * | 2009-01-29 | 2016-08-16 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US12081797B2 (en) * | 2009-01-29 | 2024-09-03 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US9877047B2 (en) | 2009-01-29 | 2018-01-23 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US9877046B2 (en) | 2009-01-29 | 2018-01-23 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US11622130B2 (en) | 2009-01-29 | 2023-04-04 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US12096029B2 (en) | 2009-01-29 | 2024-09-17 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US11284110B2 (en) * | 2009-01-29 | 2022-03-22 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US12519977B2 (en) * | 2009-01-29 | 2026-01-06 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US10382788B2 (en) | 2009-01-29 | 2019-08-13 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US10701397B2 (en) | 2009-01-29 | 2020-06-30 | Dolby Laboratories Licensing Corporation | Coding and decoding of interleaved image data |
| US11477480B2 (en) | 2009-04-20 | 2022-10-18 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US11792429B2 (en) | 2009-04-20 | 2023-10-17 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US10609413B2 (en) | 2009-04-20 | 2020-03-31 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US12058371B2 (en) | 2009-04-20 | 2024-08-06 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US12206908B2 (en) | 2009-04-20 | 2025-01-21 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US10194172B2 (en) | 2009-04-20 | 2019-01-29 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US11792428B2 (en) | 2009-04-20 | 2023-10-17 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US12114021B1 (en) | 2009-04-20 | 2024-10-08 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US12058372B2 (en) | 2009-04-20 | 2024-08-06 | Dolby Laboratories Licensing Corporation | Directed interpolation and data post-processing |
| US20120314023A1 (en) * | 2010-02-24 | 2012-12-13 | Jesus Barcons-Palau | Split screen for 3d |
| US20130100121A1 (en) * | 2011-10-20 | 2013-04-25 | Samsung Electronics Co., Ltd. | Display driver and method of operating image data processing device |
| US8970605B2 (en) * | 2011-10-20 | 2015-03-03 | Samsung Electronics Co., Ltd. | Display driver with improved power consumption and operation method of improving power consumption of image data processing device |
| US20170180740A1 (en) * | 2013-04-16 | 2017-06-22 | Fastvdo Llc | Adaptive coding, transmission and efficient display of multimedia (acted) |
| US10306238B2 (en) * | 2013-04-16 | 2019-05-28 | Fastvdo Llc | Adaptive coding, transmission and efficient display of multimedia (ACTED) |
| US10057593B2 (en) * | 2014-07-08 | 2018-08-21 | Brain Corporation | Apparatus and methods for distance estimation using stereo imagery |
| US20160014426A1 (en) * | 2014-07-08 | 2016-01-14 | Brain Corporation | Apparatus and methods for distance estimation using stereo imagery |
| US10055850B2 (en) | 2014-09-19 | 2018-08-21 | Brain Corporation | Salient features tracking apparatus and methods using visual initialization |
| US10268919B1 (en) | 2014-09-19 | 2019-04-23 | Brain Corporation | Methods and apparatus for tracking objects using saliency |
| US11050979B2 (en) * | 2015-01-11 | 2021-06-29 | A.A.A. Taranis Visual Ltd | Systems and methods for agricultural monitoring |
| US20190253673A1 (en) * | 2015-01-11 | 2019-08-15 | A.A.A Taranis Visual Ltd | Systems and methods for agricultural monitoring |
| US20230370600A1 (en) * | 2020-10-02 | 2023-11-16 | Koninklijke Philips N.V. | A method and apparatus for encoding and decoding one or more views of a scene |
| WO2022069388A1 (en) * | 2020-10-02 | 2022-04-07 | Koninklijke Philips N.V. | A method and apparatus for encoding and decoding one or more views of a scene |
| EP3979644A1 (en) * | 2020-10-02 | 2022-04-06 | Koninklijke Philips N.V. | A method and apparatus for encoding and decoding one or more views of a scene |
| US12363325B2 (en) * | 2022-10-26 | 2025-07-15 | Electronics And Telecommunications Research Institute | Method and apparatus for image encoding, and method and apparatus for image decoding |
Also Published As
| Publication number | Publication date |
|---|---|
| CN102742282B (en) | 2017-09-08 |
| US9215445B2 (en) | 2015-12-15 |
| CN102742282A (en) | 2012-10-17 |
| JP2013518515A (en) | 2013-05-20 |
| KR101828096B1 (en) | 2018-02-09 |
| WO2011094019A1 (en) | 2011-08-04 |
| JP5722349B2 (en) | 2015-05-20 |
| BR112012018976A2 (en) | 2018-03-27 |
| EP2529557A1 (en) | 2012-12-05 |
| KR20120123492A (en) | 2012-11-08 |
Similar Documents
| Publication | Title |
|---|---|
| US9215445B2 (en) | Block-based interleaving | |
| US11032545B2 (en) | Reducing seam artifacts in 360-degree video | |
| EP2920759B1 (en) | Processing high dynamic range images | |
| US9552633B2 (en) | Depth aware enhancement for stereo video | |
| US20110038418A1 (en) | Code of depth signal | |
| WO2018087425A1 (en) | An apparatus, a method and a computer program for video coding and decoding | |
| US20130010863A1 (en) | Merging encoded bitstreams | |
| KR20140043037A (en) | Method and apparatus for compensating sample adaptive offset for encoding inter layer prediction error | |
| US10542265B2 (en) | Self-adaptive prediction method for multi-layer codec | |
| US20200404339A1 (en) | Loop filter apparatus and method for video coding | |
| CN115988202B (en) | Apparatus and method for intra prediction | |
| WO2013105946A1 (en) | Motion compensating transformation for video coding | |
| EP4503605A1 (en) | Adaptive mts-based image encoding/decoding method, device, and recording medium for storing bitstream | |
| EP3804338B1 (en) | In-loop deblocking filter apparatus and method for video coding | |
| HK1215486B (en) | Processing high dynamic range images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: THOMSON LICENSING, FRANCE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: HORLANDER, THOMAS EDWARD; DORINI, BRIAN JOSEPH. REEL/FRAME: 028660/0219. Effective date: 20110223 |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: THOMSON LICENSING DTV, FRANCE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: THOMSON LICENSING. REEL/FRAME: 041370/0433. Effective date: 20170113 |
| | AS | Assignment | Owner name: THOMSON LICENSING DTV, FRANCE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: THOMSON LICENSING. REEL/FRAME: 041378/0630. Effective date: 20170113 |
| | AS | Assignment | Owner name: INTERDIGITAL MADISON PATENT HOLDINGS, FRANCE. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: THOMSON LICENSING DTV. REEL/FRAME: 046763/0001. Effective date: 20180723 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20231215 |