HK1113886A - Method and apparatus for error recovery using intra-slice resynchronization points
- Publication number: HK1113886A
- Authority: HK (Hong Kong)
Abstract
A method of video coding that includes encoding resynchronization point information, where the resynchronization point information includes information identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point. Also, a method for decoding digital video that includes receiving an encoded bitstream including resynchronization point information, where the resynchronization point information includes information identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point, decoding the received bitstream, and locating the resynchronization point in the bitstream based on the resynchronization point information.
Description
Claiming priority in accordance with 35 U.S.C. § 119
This patent application claims priority to provisional application No. 60/660,879, entitled "Content Classification for Video Processing", and provisional application No. 60/713,207, entitled "Method and Apparatus for Error Recovery Using Intra-Slice Resynchronization Points", both of which were filed prior to March 10, 2005, are assigned to the assignee of the present invention, and are expressly incorporated herein by reference.
Technical Field
The present invention relates to methods and apparatuses for encoding and decoding digital data using an error management scheme.
Background
In mobile communication systems, the demand for higher data rates and higher quality of service is increasing rapidly. However, factors such as limited transmit power, limited bandwidth, and multipath fading continue to limit the data rates handled by practical systems. In multimedia communications, particularly in error-prone environments, the error resilience of the transmitted media is critical in providing the desired quality of service, since errors in even a single decoded value can cause decoding artifacts to propagate spatially and temporally. Various encoding measures have been used to minimize errors while maintaining the necessary data rate, however, all of these techniques suffer from the problem that errors can reach the decoder side.
Hybrid coding standards, such as MPEG-1, MPEG-2, MPEG-4 (collectively referred to as MPEG-x), h.261, h.262, h.263, and h.264 (collectively referred to as h.26x), describe data processing and manipulation techniques (referred to herein as hybrid coding) that are well suited for compressing and delivering video, audio, and other information using fixed or variable length source coding techniques. In particular, the above-mentioned standards and other hybrid coding standards and techniques illustratively compress video information using intra-frame entropy coding techniques (e.g., run-length coding, Huffman coding, and the like) and inter-frame coding techniques (e.g., forward and backward predictive coding, motion compensation, and the like). In particular, in the case of video processing systems, a feature of hybrid video coding systems is prediction-based compression coding of video frames using intra-frame and/or inter-frame motion compensation coding.
Entropy coding enables a very efficient lossless representation of the symbols generated by a random information source, and is therefore an indispensable component of both lossless and lossy data compression schemes. While very beneficial to compression efficiency, entropy coding also complicates the decoding process. A feature common to all entropy coding methods is that a single source symbol, or a series of source symbols, is associated with and represented by a binary pattern, i.e., a series of 1's and 0's called a codeword, whose length increases as the symbol's likelihood decreases. More likely symbols are thus assigned more compact representations, enabling significant savings, on average, over a direct fixed-length representation of the symbol alphabet.
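As a small illustration of the principle above (the code table and probabilities below are invented for this sketch, not taken from any standard), a toy prefix-free code assigns shorter codewords to likelier symbols, bringing the average bit cost below that of a fixed-length code:

```python
# Hypothetical prefix-free code: more likely symbols get shorter codewords.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # assumed source statistics

# Expected codeword length under the variable-length code.
avg_len = sum(probs[s] * len(code[s]) for s in code)
# A fixed-length code for four symbols needs 2 bits per symbol.
fixed_len = 2

print(avg_len)    # 1.75 bits/symbol on average
print(fixed_len)  # 2 bits/symbol
```

Here the variable-length code saves 0.25 bits per symbol on average, which is the kind of gain the standards exploit at much larger scale.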
The ambiguity as to how many bits the next symbol will consume in the bitstream (i.e., the entropy-encoded representation of the information source's output) considerably complicates the decoder. More importantly, the use of variable-size codewords in combination with bits flipped by channel errors often causes the decoder to infer incorrect codeword lengths; as a result, the parsing/decoding process may lose its synchronization with the bitstream, i.e., it may begin to misidentify codeword boundaries and thus fail to correctly interpret the bitstream.
Consider a decoder that performs a basic level of error detection, encounters problems decoding the bitstream, and loses synchronization. Eventually, either due to a syntax violation (i.e., an invalid codeword) or a semantic violation (e.g., an invalid parameter value or the occurrence of an unexpected bitstream object), the decoder may recognize the problem and take the measures necessary to resynchronize itself with the bitstream. This may result in data loss far greater than the corruption that initially triggered it. Due to the spatial prediction used in digital compression, the loss may spread spatially over the entire frame. The loss is exacerbated further if the lost data is part of a reference frame for a motion-compensated prediction region, causing errors to propagate over time.
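The loss of synchronization described above can be demonstrated with the toy prefix code (again a hypothetical code, used only for illustration): flipping a single bit shifts the apparent codeword boundaries, so everything parsed after the error is wrong even though the later bits are intact.

```python
# Toy prefix-free decoder: map bitstrings back to symbols.
code = {"0": "a", "10": "b", "110": "c", "111": "d"}

def decode(bits):
    out, cw = [], ""
    for b in bits:
        cw += b
        if cw in code:          # a complete codeword has been accumulated
            out.append(code[cw])
            cw = ""
    return out

clean = "0" + "10" + "110" + "0"   # encodes a, b, c, a
corrupt = "1" + clean[1:]          # a single flipped bit at the start

print(decode(clean))    # ['a', 'b', 'c', 'a']
print(decode(corrupt))  # ['c', 'c', 'a'] -- boundaries misparsed from the error onward
```

Note that the decoder produces plausible-looking symbols from the corrupt stream; without syntax or semantic checks it has no immediate way to know it has desynchronized.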
The MPEG-x and h.26x hybrid coding standards typically provide resynchronization points (RSPs) at NALU (network abstraction layer unit) boundaries, the most common NALUs being slices. A slice may be a group of consecutive macroblocks in raster scan order, where a macroblock consists of 16 × 16 pixels. A pixel is defined by a luminance value (Y) and two chrominance values (Cr and Cb). In h.264, the Y, Cr, and Cb components are stored in a 4:2:0 format, where the Cr and Cb components are down-sampled by a factor of 2 in the X and Y directions. Thus, each macroblock consists of 256 Y components, 64 Cr components, and 64 Cb components. H.264 generalizes the slice concept by introducing slice groups and Flexible Macroblock Ordering (FMO). Slice groups and FMO enable the slice-to-macroblock association to be completely arbitrary, providing flexibility far beyond the traditional contiguous macroblock structure. Each slice starts with an RSP called a prefix code. The RSP prefix code is a byte-aligned, reserved bit-string codeword that is three bytes long. For it to serve as a true resynchronization point, all predictive coding chains must avoid referencing data located before the RSP. The overhead of the prefix code bytes and the coding efficiency loss due to broken or degraded predictive coding chains are among the drawbacks of frequent use of slices (i.e., use of short slices), and can negate their advantages in supporting error resilience. Depending on these concerns, it is not uncommon for an encoder's default behavior to encode an entire frame into a single slice. Another popular, shorter slice structure has each macroblock row constitute a slice. Slices shorter than a macroblock row are rarely used, and when they are, the reason is usually to match the slice size (in number of bits) to the packet size in which the slice must be carried.
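Because the prefix code is a byte-aligned, reserved three-byte pattern, a decoder can resynchronize at slice level by a simple byte scan. The sketch below assumes the 0x000001 start code prefix used by H.264 Annex B byte streams; the payload bytes in the example stream are invented.

```python
# Scan a byte stream for the three-byte, byte-aligned start code prefix
# (0x000001) that marks each NALU / slice-level resynchronization point.
def find_start_codes(data: bytes):
    return [i for i in range(len(data) - 2)
            if data[i:i + 3] == b"\x00\x00\x01"]

# Hypothetical stream: two NALUs, each preceded by a start code prefix.
stream = b"\x00\x00\x01\x65\xAA\xBB\x00\x00\x01\x41\xCC"
print(find_start_codes(stream))  # [0, 6]
```

The "reserved" property matters: because encoded payload data is constrained never to emulate this pattern, every hit found by the scan is a genuine resynchronization point.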
In conventional slice-based resynchronization schemes, if an error, such as a semantic or syntax error, is detected in the data at the decoder, the entire portion of the slice that occurs after the error is rendered useless. This is undesirable, especially for longer slices spanning, for example, an entire frame. There is a need for an intra-slice resynchronization point (IS-RSP) that can salvage some of the video data contained in corrupted slices. In addition, it is desirable to intelligently locate the IS-RSP so as to maximize error resilience while reducing overhead.
Disclosure of Invention
A method and apparatus for video encoding includes methods and means for: encoding resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and transmitting the encoded resynchronization point information.
In another aspect, an apparatus for video encoding includes: an encoder for encoding resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and a communicator configured to transmit the encoded resynchronization point information.
In the above aspect, the method and apparatus for video encoding may further comprise method or means for: calculating rate-distortion costs for a plurality of candidate locations; and selecting at least one of the candidate locations as the location of the resynchronization point according to the calculated rate-distortion cost. The method and apparatus for video encoding may further comprise method or means for: selecting a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of a group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames. The method and apparatus for video encoding may further comprise method or means for: selecting a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
The information for decoding the bitstream may comprise information related to adjacent video segments. The information for decoding the bitstream may comprise a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients. The information for decoding the bitstream may include information related to the context in which the bitstream following the resynchronization point is encoded.
The method and apparatus for video encoding may further comprise methods or means for: encoding the resynchronization point information in a data message, wherein the data message is one of an in-band application message, a user-specific private data message, a supplemental enhancement information (SEI) message, and an MPEG user data message.
In another aspect, a method and apparatus for decoding includes a method or means for: receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and decoding the received bitstream.
In yet another aspect, an apparatus for decoding video includes: a receiver for receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and a decoder for decoding the received bitstream.
In the above, the method and apparatus for decoding may further comprise method or means for: locating a resynchronization point in the bitstream according to the resynchronization point information. The information for decoding the bitstream may include information related to a context in which the bitstream following the resynchronization point is encoded. The method and apparatus for decoding may further comprise method or means for: comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and if the comparison reveals that the current context is not the same as the received context information, stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point. The location of the resynchronization point can be within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames. The location of the resynchronization point may be the beginning of a macroblock.
The information for decoding the bitstream may comprise information related to adjacent video segments. The information for decoding the bitstream may comprise a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
The method and apparatus for decoding may further comprise methods or means for: receiving the resynchronization point information in a data message, wherein the data message is a member of a group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information (SEI) message, and an MPEG user data message. The method and apparatus for decoding may further comprise methods or means for: receiving resynchronization point information encoded using a variable length code. The method and apparatus for decoding may further comprise methods or means for: detecting an error in the bitstream; stopping decoding of the bitstream; and continuing decoding at the located resynchronization point.
In yet another aspect, a method and apparatus for encoding multimedia data comprises a method or means for: encoding the resynchronization point data; and inserting the resynchronization point data into a multimedia stream slice. The method and apparatus for encoding may further comprise methods or means for: selecting a location of a resynchronization point within the slice; and wherein the inserting comprises inserting the resynchronization point into the selected location. The method and apparatus for selecting may comprise: calculating rate-distortion costs for a plurality of candidate locations; and selecting at least one candidate location based on the rate-distortion cost. The resynchronization point may include context information for the multimedia data.
In yet another aspect, a method and apparatus for processing a multimedia stream includes a method or device for: receiving resynchronization point data in the multimedia stream slice; and reconstructing the multimedia data according to the resynchronization point data. The resynchronization point may include context information for the multimedia data.
In the above aspects, the methods and/or apparatuses may be implemented using a computer-readable medium and/or a processor configured to perform the methods or perform the functions of the apparatuses.
Drawings
Fig. 1 is an illustration of an example of a communication system for delivering streaming video;
Fig. 2 depicts an example of a 16 × 16 pixel macroblock and an adjacent 16 × 16 pixel macroblock;
Fig. 3 illustrates an example of a process for encoding IS-RSP;
Fig. 4 depicts a video frame in which error propagation is contained by utilizing IS-RSP;
Fig. 5 is an example of an IS-RSP encoding scheme for identifying a best candidate IS-RSP location;
Fig. 6 depicts an example of error probabilities and associated distortions for use in a rate-distortion analysis for identifying a best candidate IS-RSP location;
Fig. 7 is an illustration of an example of a decoder process utilizing IS-RSP;
Fig. 8 is an illustration of another example of a decoder process utilizing IS-RSP;
Figs. 9-10 illustrate example methods and apparatus for encoding IS-RSP;
Figs. 11-12 illustrate example methods and apparatus for decoding;
Figs. 13-14 illustrate example methods and apparatus for encoding IS-RSP; and
Figs. 15-16 illustrate example methods and apparatus for processing multimedia streams.
Detailed Description
Methods and apparatus for providing improved resynchronization points within a section of a video bitstream (e.g., a slice) are described. The encoder may provide resynchronization point information comprising information for locating the resynchronization point, and information for decoding a bitstream following the resynchronization point. Since the resynchronization point is located within a section of the video bitstream, a decoder can resynchronize at the resynchronization point without sacrificing the remaining data in the section. Examples thereof are applicable to multimedia data including video, audio, or both video and audio data. In the following description, specific details are given to provide a thorough understanding of the embodiments. However, it will be understood by those skilled in the art that the embodiments may be practiced without these specific details. For example, electrical components may be shown in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, such components, other structures and techniques may be shown in detail to further explain the embodiments.
It is also noted that the embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently, and the process can be repeated. In addition, the order of the operations may be rearranged. A process terminates when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
Fig. 1 is an illustration of an example of a communication system for delivering streaming video. System 100 includes an encoder device 105 and a decoder device 110. Encoder device 105 further includes a transformer/quantizer component 115, an entropy coder component 120, an IS-RSP locator component 125, a memory component 130, a processor component 135, and a communications component 140. Processor 135 provides a computing platform to implement the processes of the other components. The transformer/quantizer component 115 transforms the video data from the spatial domain to another domain, for example the frequency domain in the case of DCT (discrete cosine transform). The data to be transformed may be intra-coded data, where the transform is performed on the actual video data, or it may be inter-coded data, where the transform is performed on motion vectors and residual errors. Other digital transforms include the Hadamard transform, DWT (discrete wavelet transform), and integer transforms (e.g., the transform used in h.264).
The transformer/quantizer component 115 allocates a number of bits to represent each transformed coefficient. The quantization of the transformed coefficients may differ for each macroblock. The entropy encoder component 120 encodes the residual block data using a Context Adaptive Variable Length Coding (CAVLC) scheme, while other variable-length coded units may be encoded using Exp-Golomb codes. Residual block data is the difference between the prediction and the original block pixel information being encoded. Above the slice layer, syntax elements are encoded into fixed-length or variable-length binary codes. At the slice layer and below, each element is encoded using a Variable Length Code (VLC). The h.264 standard also supports context-based adaptive binary arithmetic coding (CABAC) as an entropy coding scheme. The example methods discussed herein relate to the CAVLC entropy coding scheme, but similar methods may also be used with CABAC entropy coding. The IS-RSP locator component 125 performs computations to identify a set of macroblock boundaries within a slice that can be used as IS-RSPs. In one approach, a rate-distortion cost optimization analysis is implemented to find the best location for an IS-RSP. The memory component 130 stores information such as the original video data to be encoded, the encoded video data to be transmitted, or intermediate data being processed by the various encoder components.
The communication component 140 (e.g., receiver) contains circuitry and/or logic for receiving data to be encoded from an external source 145. External source 145 may be, for example, external memory, the internet, a live video and/or audio feed, and receiving data may include wired and/or wireless communication. Communications component 140 also contains circuitry and/or logic, such as a transmitter, for Transmitting (TX) encoded data over network 150. The network 150 may be part of a wired system such as telephone, cable, and fiber optic, or a wireless system. In the case of a wireless communication system, the network 150 may comprise, for example, a portion of a code division multiple access (CDMA or CDMA2000) communication system, or alternatively, the system may be: frequency Division Multiple Access (FDMA) systems; time Division Multiple Access (TDMA) systems, such as GSM/GPRS (general packet radio service)/EDGE (enhanced data GSM environment) or TETRA (terrestrial trunked radio) mobile phone technologies for the service industry; wideband Code Division Multiple Access (WCDMA); high data rate (1xEV-DO or 1xEV-DO gold multicast) systems; or, in general, any wireless communication system employing a combination of various techniques. One or more elements of encoder device 105 may be omitted, rearranged, and/or combined. For example, the processor component 135 may be located external to the encoder device 105.
Decoder device 110 contains similar components as encoder device 105, including inverse transformer/dequantizer component 155, entropy decoder component 160, error recovery component 165, memory component 170, communication component 175, and processor component 180. Decoder device 110 receives encoded data that has been transmitted over network 150 or from external memory 185. Communications component 175 contains circuitry and/or logic (e.g., a receiver) for receiving (Rx) encoded data in conjunction with network 150, as well as logic for receiving encoded data from external memory 185. The external memory 185 may be, for example, external RAM or ROM, or a remote server. The intra-coded data is first decoded by entropy decoding component 160. After entropy decoding, the data is dequantized and inverse transformed by inverse transformer/dequantizer component 155, resulting in a decoded picture that can be displayed on display component 190.
After decoding a reference frame from which inter-coded data is predicted, the inter-coded data may be decoded. Entropy decoder component 160 decodes the data, resulting in quantized/transformed residual error coefficients. Inverse transformer/dequantizer component 155 dequantizes and inverse transforms the residual error coefficients, resulting in decoded residual errors. The residual error is then combined with the best matching macroblock from the reference frame, which is identified using the received motion vector information. The decoded frames may be displayed by the display component 190 and stored in the external memory 185 or in an internal memory of the processor component 180. Display component 190 may be an integral part of a decoding device containing components such as video display hardware and logic (including a display screen), or it may be an external peripheral device. Communication component 175 also contains logic for transferring decoded frames to external storage component 185 or display component 190.
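The inter-coded reconstruction step described above, adding the decoded residual error to the motion-compensated prediction taken from the reference frame, can be sketched as follows (the sample values are invented, and real decoders operate on full macroblocks with clipping to the valid sample range):

```python
# Reconstruct a block as prediction + residual, element by element.
def reconstruct(prediction, residual):
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]

pred = [[100, 101], [102, 103]]   # best-matching block from the reference frame
resid = [[-2, 1], [0, 3]]         # dequantized, inverse-transformed residual error

print(reconstruct(pred, resid))   # [[98, 102], [102, 106]]
```

This also makes the error-propagation problem concrete: if `pred` comes from a corrupted reference frame, every reconstructed sample inherits that corruption.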
The entropy decoder component 160 also contains logic for performing various syntax and semantic checks. Syntax and semantic checks are used to identify corrupt codewords that violate any number of rules. If the bitstream is determined to be corrupt, error recovery component 165 locates the next closest non-corrupt IS-RSP in the bitstream to allow decoding to continue. Details of the processes implemented by IS-RSP locator component 125 and error recovery component 165 are discussed below. One or more elements of decoder device 110 may be omitted, rearranged, and/or combined. For example, the processor component 180 may be located external to the decoder device 110.
The introduction of Context Adaptive Variable Length Coding (CAVLC) in h.264 complicates the resynchronization problem. In h.264, many symbols are coded in context, and the probability of each symbol varies depending on the context in which it was encoded (i.e., what was processed before the symbol). Due to this context dependency, there is a risk of losing synchronization not only by losing a symbol itself, but also by losing the other data used to determine the context in which the symbol was encoded. Thus, to provide a resynchronization point (IS-RSP) in a CAVLC bitstream, the decoder needs not only the position of the resynchronization point in the bitstream; it may also need the context information on which subsequent symbols depend.
An example showing what information an IS-RSP needs to contain to enable resynchronization in an h.264 context-adaptive bitstream will now be discussed. Fig. 2 depicts an example of a 16 × 16 pixel macroblock and adjacent 16 × 16 pixel macroblocks. The macroblock 200 contains sixteen 4 × 4 pixel blocks 201 to 216. In this example, assume that macroblock 200 is an intra-coded macroblock. The context dependency of intra-coded macroblocks differs from that of inter-coded macroblocks. The context-adaptive coding of intra-coded macroblock 200 depends on the parameters of neighboring macroblocks 220 and 230. The encoding and decoding of macroblock 200 depends on the number of non-zero coefficients (Y, Cr, and Cb coefficients) in the four neighboring 4 × 4 pixel blocks 221-224 and the four neighboring 4 × 4 pixel blocks 231-234. The encoding and decoding of macroblock 200 also depends on the quantization parameter values used to encode macroblock 220 and, in some cases, on the intra-prediction modes of blocks 221-224 and 231-234. If the context parameters of macroblock 220 or 230 are lost when decoding the CAVLC-encoded symbols of macroblock 200, the decoder will not be able to continue decoding, and resynchronization is needed. In h.264, the rest of the slice containing macroblock 200 would then be lost, since resynchronization relies on the NALU prefix code. However, if macroblock 200 is a resynchronization point, decoding may continue regardless of the state of macroblocks 220 and 230, because the IS-RSP contains the necessary context parameters for the neighboring blocks, as described below.
The resynchronization point may be formed by encoding resynchronization point information including information for identifying a position of a resynchronization point within a section of the video bitstream, and encoding information for decoding the bitstream following the resynchronization point.
For example, to make the macroblock 200 a resynchronization point, the IS-RSP message may be encoded with information that the decoder uses to locate the IS-RSP flag 240, and thus the macroblock 200. To provide a robust resynchronization point, the IS-RSP message may contain neighboring MB information needed for a decoder to decode macroblock 200 (i.e., dependency information needed to decode macroblock 200), as well as information related to the location of the macroblock within the frame and information related to the location of the macroblock within the bitstream.
As an example, to use intra-coded macroblock 200 as a resynchronization point in, for example, h.264, an IS-RSP packet may contain the following h.264 specific parameters:
An MB bit offset, which gives the position of the start of the IS-RSP macroblock within the bitstream.
An MB address offset or index identifying the spatial location of the IS-RSP macroblock.
The number of non-zero coefficients of the eight 4 × 4 blocks located at the top and to the left of the IS-RSP macroblock.
The intra-prediction modes of the eight 4 × 4 blocks located at the top and to the left of the IS-RSP macroblock. (If the IS-RSP points to a macroblock encoded in intra 16 × 16 mode, this information may be omitted.)
The Quantization Parameter (QP) value of the MB located to the left of the IS-RSP macroblock. The last three items are the context adaptation parameters discussed above.
An inter-coded macroblock to be used as a resynchronization point carries the same MB bit offset and MB address offset information, but different information related to neighboring macroblocks and other context-adaptive dependencies. The context adaptation parameters of an inter-coded MB include the following information:
The modes of neighboring MBs (the macroblocks at the top and to the left).
The number of non-zero coefficients in the neighboring 4 x 4 block.
Motion vectors and reference picture indices of neighboring 4 × 4 blocks.
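The IS-RSP parameters listed above can be gathered into a single payload structure. The sketch below is purely illustrative: the field names and sample values are invented for this example and are not the patent's or the h.264 standard's syntax.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class IntraSliceRSP:
    """Hypothetical container for one IS-RSP's resynchronization data."""
    mb_bit_offset: int                 # start of the RSP macroblock in the bitstream
    mb_address: int                    # spatial index of the RSP macroblock in the frame
    # Context adaptation parameters for the neighboring (top/left) blocks:
    neighbor_nonzero_coeffs: List[int] = field(default_factory=list)  # per 4x4 block
    neighbor_intra_modes: Optional[List[int]] = None  # omitted for intra 16x16 MBs
    left_mb_qp: Optional[int] = None   # quantization parameter of the left MB

# Example payload with invented values.
rsp = IntraSliceRSP(mb_bit_offset=1032, mb_address=45,
                    neighbor_nonzero_coeffs=[3, 0, 1, 2, 5, 0, 0, 4],
                    left_mb_qp=28)
print(rsp.mb_address)  # 45
```

A decoder holding such a record could jump to `mb_bit_offset`, reinitialize its context state from the neighbor fields, and resume CAVLC parsing at macroblock `mb_address` without the lost neighbors.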
In one example, the IS-RSP data may be sent in a "user data unregistered" supplemental enhancement information (SEI) message, thereby maintaining the conformance of the h.264 bitstream. Although the above example is specific to the h.264 standard, the underlying principles of IS-RSP are readily applicable to other hybrid video coding standards, such as MPEG-x and h.26x. This is possible because the syntax of each of these standards provides, in one form or another, for carrying in-band application data or user-specific private data, e.g., SEI messages in h.264 or user data in MPEG-2. The IS-RSP data may also be encoded using compression techniques to improve coding efficiency. For example, if the IS-RSPs are spatially close to each other in a frame, the IS-RSP data may be encoded using spatial prediction techniques.
The intra-slice resynchronization point, or IS-RSP, may be extended to other video segments besides slices. A slice, referred to as a group of blocks in some related standards, is one of several partition levels in the h.264 or MPEG-x video sequence architecture. The h.264 architecture comprises a sequence of one or more pictures (or frames), where a picture is composed of one or more slices and a slice is composed of one or more macroblocks. A macroblock may be further partitioned into sub-macroblocks of various sizes. A resynchronization point may contain information that enables resynchronization at any layer of the partitioned video (e.g., within a sequence, within a frame, and within a slice). For example, if slice header information is corrupted, an RSP capable of enabling resynchronization within the corrupted slice will require the slice header information, and the RSP will effectively be an intra-slice RSP.
An IS-RSP need not be sent for each macroblock, since doing so would require a significant amount of overhead data. The encoder device may use an algorithm, such as the one described below, to intelligently select where to position each IS-RSP. The algorithm described below is one example of a method for keeping overhead to a minimum while improving error resilience.
Channel impairments can be addressed by generating resynchronization points during the encoding process and providing them to the decoding process (as often as the trade-offs described below permit). The resynchronization point should also be uniquely identifiable in the bitstream. In addition, the structure should provide a reasonably good level of immunity to noise-induced emulation and noise-induced modification, to the extent that these are not reliably detectable (in a statistical sense). The resynchronization point may be used to prevent the propagation of predictive coding dependencies by initializing all predictors to appropriate default values, which limits future spatial and/or temporal predictions from involving macroblocks prior to the IS-RSP.
The comprehensive and correct design of the encoding process for identifying the resynchronization points within a slice may take into account: 1) a channel model, 2) bandwidth and quality (distortion) constraints for a particular application, and 3) the content of the frame.
Depending on the application, it may be desirable for content dependencies to exist. In one possible application, the absence of an intra-slice resynchronization point in a particular slice may be exploited to signal to the decoder that, in case of errors and consequent data loss in that slice, an appropriate (spatial or temporal) concealment algorithm known to both decoder and encoder can be applied very satisfactorily.
The following provides an example IS-RSP encoding scheme incorporating a form of content adaptation. The scheme may also be used to illustrate the general architecture and principles to be employed by the encoder for identifying the location of the resynchronization point.
FIG. 3 illustrates an example process for encoding IS-RSP. The process 300 includes an iterative loop for identifying an optimal or near-optimal IS-RSP location from a plurality of candidate locations. This allows the location of the IS-RSP to be selected in a variable or adaptive manner. This variability or adaptivity provides an advantage over methods in which consecutive resynchronization points are located a fixed distance apart (e.g., one resynchronization point per 100 macroblocks). The process 300 may be implemented by an encoding device (e.g., IS-RSP location component 125) in conjunction with the processor 135 in fig. 1. At step 305, the encoder (or a selector) selects a candidate location for the IS-RSP. Details of the method of selecting IS-RSP candidate locations are discussed below. At step 310, a metric, such as a rate-distortion cost, is calculated. The metric used in step 310, such as the rate-distortion cost discussed below, is a measure of the trade-off between the number of bits added by the IS-RSP and the potential distortion caused by expected errors. At step 315, the IS-RSP location is selected by comparing the rate-distortion costs of the candidate locations. The selected location (from cost calculation 310) may be stored in memory. Decision block 320 tests whether there are more candidate locations to evaluate; if there are, steps 305, 310 and 315 are repeated, and if there are not, process 300 ends by encoding the IS-RSP information in step 325. One or more elements of process 300 may be omitted, rearranged, and/or combined.
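The iterative loop of process 300 may be sketched, for illustration only, as follows (the function names and the toy cost function are hypothetical; in practice the metric of step 310 would be the channel-aware rate-distortion cost developed below):

```python
def select_rsp_location(candidate_locations, rd_cost):
    """Iterate over candidate IS-RSP locations (steps 305 and 320) and keep
    the one with the lowest rate-distortion cost (steps 310 and 315).

    candidate_locations: iterable of macroblock indices to evaluate
    rd_cost: callable returning the R-D cost of placing an IS-RSP at a
             given location
    Returns (best_location, best_cost), or (None, inf) if there are no
    candidates.
    """
    best_location, best_cost = None, float("inf")
    for loc in candidate_locations:       # step 305 / decision block 320
        cost = rd_cost(loc)               # step 310: compute the metric
        if cost < best_cost:              # step 315: keep the cheapest
            best_location, best_cost = loc, cost
    return best_location, best_cost

# With a toy cost function, the candidate nearest a (hypothetical)
# optimum at macroblock 48 is selected:
best, cost = select_rsp_location([10, 50, 90], lambda loc: abs(loc - 48))
# best == 50, cost == 2
```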
Fig. 4 depicts a video frame in which error propagation is controlled by utilizing IS-RSP. Frame 400 represents an intra-coded frame composed of a single slice and six rows 420a-420f, where rows 420a-420f are composed of macroblocks (not shown). The encoding process shown in fig. 3, for example, has located a plurality of IS-RSPs 405 in frame 400. As described above and depicted in fig. 2, the decoding of each intra-coded macroblock also depends on information contained in the left and upper neighboring macroblocks of the macroblock being decoded. If the IS-RSPs 405 were not included in the frame 400, the error 410 would propagate through the entire remainder of the frame decoded after the error is encountered. This would result in most if not all of the second row 420b and rows 420c-420f being lost, since the decoder would be forced to resynchronize at the next slice (the next NALU start code prefix). By locating the IS-RSPs as shown, error propagation may be limited to the region 415 to the left of and below the IS-RSP 405b. After the resynchronization point IS-RSP 405b, the rest of row 420b may be decoded. In row 420c, macroblocks below the corrupt macroblocks cannot be recovered until the next resynchronization point 405c, because they depend on the neighboring macroblocks above. Since the IS-RSP 405c in row 420c is to the left of the IS-RSP 405b in row 420b, the error 415 in row 420c is contained earlier. This is a far more favorable distortion outcome than that provided by the slice resynchronization NALU.
FIG. 5 is an example of an IS-RSP encoding method for identifying candidate IS-RSP locations. The frame 500 is made up of several rows 505, where each row 505 also constitutes a slice. In this example, the encoder will provide two IS-RSPs in each slice, each located within a narrow region of macroblocks. The first region includes three macroblock locations 510, 515, and 520, and the second region includes three macroblock locations 525, 530, and 535. By locating the IS-RSPs in these centrally located portions of the slice as shown, propagation of errors can be mitigated.
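For illustration, the two narrow candidate regions per row in fig. 5 could be derived as follows (the placement rule here, centering the regions near one third and two thirds of the row, is an assumption made for this sketch and is not mandated by the description):

```python
def candidate_regions(row_length_mb, region_width=3):
    """Return two lists of candidate IS-RSP macroblock positions per row:
    a narrow region around 1/3 of the row and one around 2/3 of the row
    (cf. positions 510-520 and 525-535 in fig. 5)."""
    half = region_width // 2
    centers = (row_length_mb // 3, 2 * row_length_mb // 3)
    return [list(range(c - half, c + half + 1)) for c in centers]

# For a row of 30 macroblocks this yields [[9, 10, 11], [19, 20, 21]].
```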
The rate-distortion (R-D) optimization problem described below uses the following nomenclature, quantities, and notation. Assume an R-D cost function of the general (composite) form
R + λ·D [1]
as the quantity to be minimized when finding a good location for the IS-RSP.
R and D represent, respectively, the rate (the number of bits required to encode the IS-RSP) and the distortion (reconstruction error) associated with selecting a particular coding mode, and λ is a Lagrange multiplier that describes the trade-off between rate and distortion experienced in a particular R-D optimization setting. Thus, the cost function weighs the additional cost incurred to encode the resynchronization point against the savings from reduced distortion.
One standard R-D optimization implemented in hybrid video encoders is the one used when determining macroblock coding modes; in general, the coding mode is a vector quantity with multiple parameters. Assuming that the resynchronization point imposes no constraints, solving this standard R-D optimization problem for the macroblock immediately to the right of position i (where i ∈ {510, 515, 520} or i ∈ {525, 530, 535}) yields a rate R_S,i and a distortion D_S,i, respectively. Thus, assuming no resynchronization point insertion and no data loss, the total cost is given by:
R_S,i + λ·D_S,i [2]
For unreliable channels, equation [2] (which assumes no data loss) is inappropriate and must be modified. Any appropriate formulation must also take into account random channel behavior; thus, the total cost becomes an expected quantity (a probabilistic average) rather than the deterministic quantity calculated in equation [2] above.
A more realistic, loss-aware analysis associates an expected cost due to errors (i.e., no resynchronization point is inserted at location i, but losses may occur) with the solution of the standard R-D optimization problem. The probability of a slice error is assumed to be known and is denoted p_e. It is also assumed that when a slice is in error, it has been corrupted at a single point and over a contiguous range of bits. By using appropriate (data/symbol) interleaving techniques, these two assumptions can be made quite accurate in practice, especially in the case of digital wireless communications. As a final simplification, it is assumed that when a slice has errors, the error location within the slice is uniformly distributed with respect to the macroblock units. A more accurate model could also take into account the size (number of bits) of the coded representation of each macroblock in the slice. Under the above assumptions, the channel-aware expected total cost associated with no resynchronization point insertion becomes:
R_S,i + λ·((1 − p_e)·D_S,i + p_e·(p_i·(D_L,a + D_L,b))) [3]
where the parameters newly introduced in equation [3] are depicted in fig. 6. Fig. 6 depicts an example of the error probabilities and associated distortions used in the rate-distortion analysis for identifying the best candidate IS-RSP location. The conditional probability p_i (610) is the probability that, given that an error is present in the slice (where p_e is the probability of such an error occurring), the error lies before position "i". D_L,b (620) is the distortion associated with the slice data lost, due to the error 615, immediately prior to the candidate IS-RSP to be inserted. D_L,a is the distortion associated with the data lost after the candidate IS-RSP to be inserted.
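Under the uniform-error-location simplification above, the conditional probability p_i may be sketched as a simple ratio (an illustrative approximation only; a more accurate model would weight each macroblock by the size of its coded representation):

```python
def p_error_before(i, num_macroblocks_in_slice):
    """P(error location < i | slice is in error), assuming the error
    location is uniformly distributed over the macroblock units of the
    slice."""
    return i / num_macroblocks_in_slice

# For a 100-macroblock slice, an error (given that one occurred) precedes
# position 25 with probability 0.25.
```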
The probability that a slice is corrupted after the candidate resynchronization point considered for position "i" (605) does not favor insertion of the candidate resynchronization point, because insertion would be unprofitable in that case. Therefore, to avoid an unfair bias, equation [3] contains no distortion component attributable to the loss of slice data from errors possibly occurring to the right of position "i" (605).
The presence of an intra-slice resynchronization point at position "i" (605) does not avoid the loss of slice data between the error position (610) and position "i" (605). Thus, when an error occurs before the candidate resynchronization point, a distortion D_L,b (620) is incurred by the loss of that portion of the slice data. D_L,b (620) is therefore included in the expected total cost function used to evaluate whether to insert the resynchronization point at "i" (605).
The quantity D_L,a (625) involves distortion considerations that are reduced by the mitigating impact of an appropriate spatial or temporal concealment algorithm available at the decoder. It is interesting and helpful to consider both extremes, "no concealment" and "complete concealment", as discussed below.
Ideally, if a priori information about their future use is available, the quantity D_L,a (625) also reflects distortions that would be reduced by the mitigating impact of subsequent IS-RSPs in the same slice.
The next step is to calculate (using a rate-distortion calculator component, in one example) the expected "channel-aware" R-D cost associated with actually inserting the IS-RSP at position "i" (605). When processing the macroblock immediately after (i.e., to the right of) the resynchronization point, the encoder must respect the coding constraints that support resynchronization as described above (e.g., initializing all predictors to appropriate default values). Assume that, under these constraints, the encoder has formulated and solved an R-D optimization problem (very similar to the standard R-D optimization problem described above) of the form:
(R_R,i + R_O,i) + λ·D_R,i [4]
In equation [4], R_R,i and D_R,i are, respectively, the rate and distortion (reconstruction error) associated with the constrained coding mode selected for the resynchronization point macroblock, and R_O,i is the rate associated with the overhead information that must be included in the bitstream (e.g., IS-RSP information included in SEI messages). As discussed above, the R-D cost expression in equation [4] is not channel-aware, and it can be modified to introduce channel-awareness as follows:
(R_R,i + R_O,i) + λ·((1 − p_e)·D_R,i + p_e·(p_i·(D_L,a + D_L,b))) [5]
Since decoding can be restarted at position "i" (605), the distortion D_L,a (625) can be set to 0. The resulting "channel-aware" expected R-D cost associated with inserting the resynchronization point at position "i" is given by:
(R_R,i + R_O,i) + λ·((1 − p_e)·D_R,i + p_e·(p_i·D_L,b)) [6]
Thus, the overall algorithm for identifying good macroblock locations at which to place the IS-RSP is to compare the R-D cost obtained from equation [3] without the IS-RSP (see equation [7] below) against the R-D cost obtained from equation [6] with the IS-RSP included (see equation [8] below). The encoder computes the following two quantities:
C_No_RSP,i = R_S,i + λ·((1 − p_e)·D_S,i + p_e·(p_i·(D_L,a + D_L,b))) [7]
C_RSP,i = (R_R,i + R_O,i) + λ·((1 − p_e)·D_R,i + p_e·(p_i·D_L,b)) [8]
equation [7] may be computed for all candidate positions for locating IS-RSP, including + -i e (510, 515, 520) or + -i e (525, 530, 535) in FIG. 5]And [8]]. If inequality C is satisfied in at least one of the candidate locationsRSP,i<CNo_RSP,iThe encoder decides to insert IS-RSP. And if the inequality C is satisfied at more than one candidate positionRSP,i<CNo_RSP,iThen the encoder must make a selection (as shown in step 315 in fig. 3) that may be beneficial to get the minimum CRSP,iThe location of the value. This enables content adaptation to some extent, since the coding constraints associated with the resynchronization point may be of much less importance when the resynchronization point is aligned with an appropriate image shape (e.g., a vertical edge across which horizontal prediction from a neighbor to the left of the vertical edge has not provided satisfactory performance and is therefore undesirable). In these cases, RR,iAnd DR,iMay be very close to R respectivelyS,iAnd DS,iWill reduce the efficiency advantage of coding with full flexibility and enlarge the distortion component (p)e.pi.DL,a) C caused byRSP,iAnd CNo_RSP,iThereby increasing the likelihood of IS-RSP insertion.
An interesting situation arises when one considers an application, and a matching encoder design, that reflects significant distortion concerns (i.e., a λ value assumed to be bounded away from zero). Also assume that the channel's p_e is not negligible. In the extreme case of "no concealment" at the decoder, the distortion component D_L,a, which then represents total signal loss, may be very large, so that the term (p_e·p_i·D_L,a) becomes dominant and makes C_No_RSP,i (given by equation [7]) greater than C_RSP,i (given by equation [8]), thus suggesting insertion of a resynchronization point. At the other extreme, if "complete concealment" is assumed (which can be well approximated given a successful concealment algorithm and suitable image content), all distortion components due to slice data loss (D_L,*) vanish, leaving the following costs (from equations [7] and [8], respectively):
C_No_RSP,i = R_S,i + λ·(1 − p_e)·D_S,i [9]
C_RSP,i = (R_R,i + R_O,i) + λ·(1 − p_e)·D_R,i [10]
an additional rate R allocated for overhead information for identifying IS-RSP IS requiredO,iAnd the sub-optimal performance of the code executed according to the constraints imposed by the resynchronization point, such that CRSP,i>CNo_RSP,iThis means that there is no need to use a resynchronization point. For other cases in between the two cases, equation [7]]And [8]]A simple but useful framework is provided for locating the on-chip resynchronization point.
Fig. 7 is an illustration of an example decoder process utilizing IS-RSP. Process 700 may be implemented by a decoder device, such as decoder device 110 in fig. 1. At step 705, the decoder device first receives an encoded video bitstream that was encoded using IS-RSP in the manner described in the examples above. A receiving device (e.g., communication component 175 in fig. 1) may perform operation 705. At step 710, the decoder device decodes the received video bitstream using the methods employed in encoding the bitstream, as outlined in standards such as H.26x and MPEG-x. A decoding device, such as inverse transformer/dequantizer component 155 in fig. 1, may perform operation 710. The H.26x and MPEG-x standards also clearly specify the semantic and syntax criteria that must be followed by standards-compliant encoders and decoders. At step 715, the decoder compares the values of decoded variables to the ranges of values set by the semantic and syntax rules to identify whether the bitstream is corrupt. Comparing means, such as entropy decoding component 160 and error recovery component 165 in fig. 1, may perform the semantic and syntax comparison checks of step 715. Semantic comparison checks include, but are not limited to, the following:
NAL unit header byte syntax element.
SPS (sequence parameter set) syntax elements.
PPS (picture parameter set) syntax element.
Slice header syntax element.
Slice layer access unit descriptor.
The total number of macroblocks decoded.
Macroblock layer syntax elements.
Availability flag (in a context dependent manner).
Insufficient data in the bitstream buffer for continuous decoding.
Comparison of externally provided reference frame buffers (size and number) with the current (i.e., valid) SPS implications.
Syntax comparison checks include decoding failures associated with decoding CAVLC entropy codewords. If an erroneous bit is detected through violation of any semantic or syntax rule, or in any other way, step 720 detects the failure and initiates resynchronization. If no rules are violated at step 720, step 725 checks whether there is more data to decode. A decoding device (e.g., error recovery component 165 of fig. 1) performs the operations of steps 720 and 725. If there is more data, steps 705 to 725 (and possibly 730) continue until there is no more data and the process ends.
If a bit error is detected, resynchronization is performed. To do so, the decoder locates the next non-corrupt IS-RSP in the bitstream and resumes decoding at step 730, repeating steps 705 through 730 from the located IS-RSP. A locating device (e.g., error recovery component 165 of fig. 1) performs the locating operation of step 730. The decoder may have received the IS-RSP information in the bitstream at step 705. In one example, the IS-RSP information may be included in an in-band application message or a user-specific private data message (e.g., an SEI message in H.264 or a user data message in MPEG-2). As discussed above, the message contains information on the location of the IS-RSP macroblock within the video frame, as well as information on the location of the IS-RSP within the bitstream. The message also contains the data the decoder needs to continue decoding (step 710) at the macroblock location identified by the IS-RSP. One or more elements of process 700 may be omitted, rearranged, and/or combined.
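The detect-then-resynchronize flow of process 700 may be sketched as follows (the unit-level bitstream interface and the error check are placeholders; an actual decoder would apply the semantic and syntax checks listed above):

```python
def decode_with_rsp(units, rsp_positions, decode_unit):
    """Decode a sequence of bitstream units (steps 705-725); on a detected
    error (step 720), locate the next IS-RSP position and resume decoding
    there (step 730).

    units: list of opaque bitstream units
    rsp_positions: sorted unit indices at which an IS-RSP begins
    decode_unit: callable decoding one unit, raising ValueError when a
                 semantic or syntax rule is violated
    Returns the list of successfully decoded units.
    """
    decoded, pos = [], 0
    while pos < len(units):
        try:
            decoded.append(decode_unit(units[pos]))
            pos += 1
        except ValueError:
            later = [p for p in rsp_positions if p > pos]
            if not later:
                break              # no further IS-RSP: stop decoding
            pos = later[0]         # step 730: resume at the next IS-RSP
    return decoded
```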
Fig. 8 is an illustration of another example decoder process utilizing IS-RSP. Process 800 contains steps similar to those of process 700 of fig. 7, but the IS-RSP is used to identify previously undetected corruption of the bitstream, rather than first detecting corruption and then locating an IS-RSP as in fig. 7. Steps 805, 810, and 820 are similar to steps 705, 710, and 725, respectively, of fig. 7. While decoding the bitstream, the decoder may encounter a point that has been identified as an IS-RSP in an IS-RSP information message. When the decoder locates an IS-RSP in this manner at step 815, it compares, at step 825, the context of the current bitstream to the context information contained in the IS-RSP message. If the contexts are inconsistent, a comparison check (not shown) against the ranges of values set by the semantic and syntax rules is performed on both the current data in the bitstream and the data included in the IS-RSP. By comparing the data values in the current bitstream and the data included in the IS-RSP against the ranges of values allowed by the semantic and syntax rules, the decoder can determine whether the current bitstream and/or the IS-RSP data are corrupt. If the bitstream is found to be corrupt at step 830, the decoder may stop its current decoding, resume decoding at step 835 at the IS-RSP it has just located, and proceed to step 805. If instead it is determined that the IS-RSP, rather than the bitstream, is corrupt, the process continues at step 805. If both the bitstream and the IS-RSP are determined to be corrupt (not shown), the next IS-RSP may be located and the process continued at step 805. Furthermore, if the context comparison between the current context and the context included in the IS-RSP shows agreement, no corruption is identified and the process continues at step 805. Comparing means, such as entropy decoding component 160 and error recovery component 165 in fig. 1, may perform the context and/or semantic and syntax comparison checks of step 825. One or more elements of the process 800 may be omitted, rearranged, and/or combined.
The examples described above utilize 16 x 16 pixel macroblocks for IS-RSP. Other block sizes may also be utilized for the IS-RSP method discussed above, as known to those skilled in the art.
Fig. 9 illustrates an example method 900 for video encoding. The method 900 includes: encoding 910 resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and transmitting 920 the encoded resynchronization point information. Fig. 10 illustrates an example apparatus 1000 for video encoding. The apparatus 1000 comprises: an encoding module 1010 configured to encode resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream, and information for decoding the bitstream following the resynchronization point; and a transmission module 1020 configured to transmit the encoded resynchronization point information.
Fig. 11 illustrates an example method 1100 for decoding video data. The method 1100 comprises: receiving 1110 an encoded bitstream comprising resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and decoding 1120 the received bitstream. Fig. 12 illustrates an example apparatus 1200 that decodes video. The apparatus 1200 comprises: a receiver module 1210 configured to receive an encoded bitstream comprising resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and a decoding module configured to decode the received bitstream.
Fig. 13 illustrates an example method 1300 for encoding multimedia data. The method 1300 includes: encoding 1310 resynchronization point data; and inserting 1320 the resynchronization point data into the multimedia stream slice. The method may further comprise selecting a location of a resynchronization point within the slice; and wherein inserting comprises inserting the resynchronization point in the selected location. The selecting may include: calculating rate-distortion costs for a plurality of candidate locations; and selecting at least one candidate location based on the rate-distortion cost. Fig. 14 illustrates an example apparatus 1400 for performing encoding of multimedia data. The apparatus 1400 comprises: an encoding module 1410 configured to encode resynchronization point data; and an insertion module 1420 configured to insert the resynchronization point data into a multimedia stream slice.
Fig. 15 illustrates an example method 1500 for processing a multimedia stream. The method 1500 includes: receiving 1510 resynchronization point data in a multimedia stream slice; and reassembling 1520 multimedia data according to the resynchronization point data. Fig. 16 illustrates an example apparatus 1600 for processing a multimedia stream. The apparatus 1600 includes: a receiving module configured to receive resynchronization point data in a multimedia stream slice; and a reassembly module configured to reassemble the multimedia data according to the resynchronization point data.
The examples described above use video only as an example. The methods and apparatus described above may also be used for other streaming data forms, including audio, graphics, images, text, and combinations thereof.
Those of skill in the art would understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
Those of skill would further appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the examples disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed methods.
The various illustrative logical blocks, modules, and circuits described in connection with the examples disclosed herein may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware elements, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the examples disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). The ASIC may reside in a wireless modem. In the alternative, the processor and the storage medium may reside as discrete components in a wireless modem.
The previous description of the disclosed examples is provided to enable any person skilled in the art to make or use the disclosed methods and apparatus. Various modifications to these examples will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other examples without departing from the spirit or scope of the disclosed methods and apparatus.
Thus, various methods and apparatus have been described for providing a resynchronization point within a section of a video bitstream to enable a decoder to locate the resynchronization point and decode the bitstream following the resynchronization point.
Claims (106)
1. A method of video encoding, comprising:
encoding resynchronization point information, wherein the resynchronization point information comprises information identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and
transmitting the encoded resynchronization point information.
2. The method of claim 1, further comprising:
calculating rate-distortion costs for a plurality of candidate locations; and
selecting at least one of the candidate locations as a location of the resynchronization point according to the calculated rate-distortion cost.
3. The method of claim 1, further comprising selecting a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
4. The method of claim 1, further comprising selecting a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
5. The method of claim 1, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
6. The method of claim 1, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
7. The method of claim 1, further comprising encoding the resynchronization point information in a data message, wherein the data message is one of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
8. The method of claim 1, wherein the information for decoding the bitstream comprises information related to a context in which the bitstream follows the resynchronization point.
9. An apparatus for video encoding, comprising:
encoding means for encoding resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and
means for transmitting the encoded resynchronization point information.
10. The apparatus of claim 9, further comprising:
computing means for computing rate-distortion costs for a plurality of candidate locations; and
means for selecting at least one of the candidate locations as a location of the resynchronization point according to the calculated rate distortion cost.
11. The apparatus of claim 9, further comprising means for selecting a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
12. The apparatus of claim 9, further comprising means for selecting a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
13. The apparatus of claim 9, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
14. The apparatus of claim 9, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
15. The apparatus of claim 9, further comprising means for encoding the resynchronization point information in a data message, wherein the data message is one of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
16. The apparatus of claim 9, wherein the information for decoding the bitstream comprises information related to a context in which the bitstream follows the resynchronization point.
17. A processor for video encoding, the processor configured to control:
encoding resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and
transmitting the encoded resynchronization point information.
18. The processor of claim 17, further configured to control:
calculating rate-distortion costs for a plurality of candidate locations; and
selecting at least one of the candidate locations as a location of the resynchronization point according to the calculated rate-distortion cost.
19. The processor of claim 17, further configured to control selection of a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
20. The processor of claim 17, further configured to control selecting a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
21. The processor of claim 17, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
22. The processor of claim 17, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
23. The processor of claim 17, further configured to control encoding the resynchronization point information in a data message, wherein the data message is one of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
24. The processor of claim 17, wherein the information for decoding the bitstream comprises information related to a context in which the bitstream following the resynchronization point is encoded.
25. An apparatus for video encoding, comprising:
an encoder for encoding resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and
a communicator configured to transmit the encoded resynchronization point information.
26. The apparatus of claim 25, further comprising a selector to select a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
27. The apparatus of claim 25, further comprising a selector to select a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
28. The apparatus of claim 25, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
29. The apparatus of claim 25, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
30. The apparatus of claim 25, wherein the encoder encodes the resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
31. The apparatus of claim 25, wherein the encoder encodes the resynchronization point information using variable length coding.
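Claim 31 encodes the resynchronization point information with variable length coding but does not name a particular code. One widely used choice in video bitstreams is the order-0 unsigned Exp-Golomb code (e.g. in H.264/AVC syntax); the sketch below uses it purely as an example of what "variable length coding" could mean here.

```python
def exp_golomb_encode(value: int) -> str:
    """Order-0 unsigned Exp-Golomb code for a non-negative integer.

    The codeword is (value + 1) in binary, preceded by a run of zeros
    one shorter than that binary string: small values get short codes.
    """
    code_num = value + 1
    bits = bin(code_num)[2:]         # binary form of value + 1
    prefix = '0' * (len(bits) - 1)   # leading zeros: bit-length minus one
    return prefix + bits
```

Fields of the resynchronization point information (location offset, quantization parameter, and so on) could each be written with such a code, keeping the overhead of frequent resynchronization points small.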
32. The apparatus of claim 25, further comprising:
a selector; and
a calculator for calculating rate-distortion costs for a plurality of candidate locations;
wherein the selector selects at least one of the candidate locations as the location of the resynchronization point according to the calculated rate-distortion cost.
33. A computer-readable medium embodying a method for video encoding, the method comprising:
encoding resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point within a section of a video bitstream and information for decoding the bitstream following the resynchronization point; and
transmitting the encoded resynchronization point information.
34. The computer-readable medium of claim 33, wherein the method further comprises:
calculating rate-distortion costs for a plurality of candidate locations; and
selecting at least one of the candidate locations as a location of the resynchronization point according to the calculated rate-distortion cost.
35. The computer-readable medium of claim 33, wherein the method further comprises selecting a location of the resynchronization point within a section of the video bitstream, wherein the section is a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
36. The computer-readable medium of claim 33, wherein the method further comprises selecting a location of the resynchronization point within a section of the video bitstream, wherein the resynchronization point is a start of a macroblock.
37. The computer-readable medium of claim 33, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
38. The computer-readable medium of claim 33, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
39. The computer-readable medium of claim 33, wherein the method further comprises encoding the resynchronization point information in a data message, wherein the data message is one of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
40. The computer-readable medium of claim 33, wherein the information for decoding the bitstream comprises information related to a context in which the bitstream following the resynchronization point is encoded.
41. A method for decoding video data, comprising:
receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and
decoding the received bitstream.
42. The method of claim 41, wherein the method further comprises:
locating a resynchronization point in the bitstream according to the resynchronization point information.
43. The method of claim 41, wherein the information for decoding the bitstream comprises information related to a context in which the bitstream following the resynchronization point is encoded.
44. The method of claim 43, wherein the method further comprises:
comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and
stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point if the comparing shows that the current context is not the same as the received context information.
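Claim 44's context check can be sketched as a field-by-field comparison between the decoder's running state and the context signaled in the resynchronization point information. The dictionary representation and field names below are assumptions for illustration only.

```python
def context_mismatch(current, signaled):
    """Return the names of context fields (e.g. quantization parameter,
    prediction mode) that disagree between the decoder's running state
    and the context carried in the resynchronization point information.

    A non-empty result suggests an undetected error corrupted the
    preceding data: the decoder stops and resumes decoding at the
    resynchronization point instead of continuing with bad state.
    """
    return sorted(k for k in signaled if current.get(k) != signaled[k])
```

This gives the decoder a way to catch errors that did not produce an illegal codeword: the bitstream decoded cleanly, but the resulting state no longer matches what the encoder recorded.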
45. The method of claim 41, wherein the location of the resynchronization point is within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
46. The method of claim 41, wherein the location of the resynchronization point is a start of a macroblock.
47. The method of claim 41, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
48. The method of claim 41, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
49. The method of claim 41, wherein the method further comprises:
receiving the resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
50. The method of claim 41, wherein the method further comprises:
receiving resynchronization point information encoded using a variable length code.
51. The method of claim 41, wherein the method further comprises:
detecting an error in the bitstream;
stopping decoding of the bitstream; and
resuming decoding at the located resynchronization point.
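The detect/stop/resume flow of claims 41-51 can be sketched as a loop over decoded symbols, skipping from an error to the next resynchronization point. Real decoders operate on a bit-level stream; the token list, the `'ERR'` marker, and the index-based resync points below are simplifications invented for illustration.

```python
def decode_with_resync(symbols, resync_points):
    """Illustrative intra-slice error recovery (not a real codec).

    symbols: list where each entry is a decodable value or the string
        'ERR' marking a corrupted region.
    resync_points: sorted indices at which decoding may independently
        restart, e.g. signaled via resynchronization point information.
    Returns the indices of symbols actually decoded; corrupted spans up
    to the next resynchronization point are skipped rather than causing
    the rest of the slice to be discarded.
    """
    decoded, i = [], 0
    while i < len(symbols):
        if symbols[i] == 'ERR':
            # Error detected: stop decoding, locate the next
            # resynchronization point, and resume there.
            nxt = next((p for p in resync_points if p > i), None)
            if nxt is None:
                break          # no resync point left: drop the remainder
            i = nxt
        else:
            decoded.append(i)
            i += 1
    return decoded
```

Without intra-slice resynchronization points, everything after the error up to the next slice header would be lost; with them, only the span between the error and the next point is discarded.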
52. An apparatus for decoding video, comprising:
receiving means for receiving an encoded bitstream comprising resynchronization point information, wherein the resynchronization point information comprises information for identifying a location of a resynchronization point and information for decoding a bitstream following the resynchronization point; and
decoding means for decoding the received bitstream.
53. The apparatus of claim 52, further comprising:
means for locating a resynchronization point in the bitstream according to the resynchronization point information.
54. The apparatus of claim 52, wherein the information for decoding the bitstream comprises information related to a context in which a bitstream following the resynchronization point is encoded.
55. The apparatus of claim 54, further comprising:
means for comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and
stopping means for stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point when the comparing shows that the current context is not the same as the received context information.
56. The apparatus of claim 52, wherein the location of the resynchronization point is within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
57. The apparatus of claim 52, wherein the location of the resynchronization point is a start of a macroblock.
58. The apparatus of claim 52, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
59. The apparatus of claim 52, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
60. The apparatus of claim 52, further comprising:
receiving means for receiving resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
61. The apparatus of claim 52, further comprising:
receiving means for receiving resynchronization point information encoded using a variable length code.
62. The apparatus of claim 52, further comprising:
detecting means for detecting an error in the bitstream;
stopping means for stopping decoding of the bitstream; and
means for resuming decoding at the located resynchronization point.
63. A processor for decoding video, the processor configured to control:
receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and
decoding the received bitstream.
64. The processor of claim 63, wherein the processor is further configured to control:
locating a resynchronization point in the bitstream according to the resynchronization point information.
65. The processor of claim 63, wherein the information for decoding the bitstream comprises information related to a context in which a bitstream following the resynchronization point is encoded.
66. The processor of claim 65, wherein the processor is further configured to control:
comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and
stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point if the comparing shows that the current context is not the same as the received context information.
67. The processor of claim 63, wherein the location of the resynchronization point is within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
68. The processor of claim 63, wherein the location of the resynchronization point is a start of a macroblock.
69. The processor of claim 63, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
70. The processor of claim 63, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
71. The processor of claim 63, wherein the processor is further configured to control:
receiving the resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
72. The processor of claim 63, wherein the processor is further configured to control:
receiving resynchronization point information encoded using a variable length code.
73. The processor of claim 63, wherein the processor is further configured to control:
detecting an error in the bitstream;
stopping decoding of the bitstream; and
resuming decoding at the located resynchronization point.
74. An apparatus for decoding video, comprising:
a receiver for receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and
a decoder for decoding the received bitstream.
75. The apparatus of claim 74, wherein the decoder is further configured to locate a resynchronization point in the bitstream according to the resynchronization point information.
76. The apparatus of claim 74, wherein the information for decoding the bitstream comprises information related to a context in which a bitstream following the resynchronization point is encoded.
77. The apparatus of claim 76, wherein the decoder is further configured to:
comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and
stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point if the comparing shows that the current context is not the same as the received context information.
78. The apparatus of claim 74, wherein the location of the resynchronization point is within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
79. The apparatus of claim 74, wherein the location of the resynchronization point is a start of a macroblock.
80. The apparatus of claim 74, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
81. The apparatus of claim 74, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
82. The apparatus of claim 74, wherein the receiver is further configured to receive resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
83. The apparatus of claim 74, wherein the decoder is further configured to:
detecting an error in the bitstream;
stopping decoding of the bitstream; and
resuming decoding at the located resynchronization point.
84. A computer-readable medium embodying a method for decoding video, the method comprising:
receiving an encoded bitstream including resynchronization point information, wherein the resynchronization point information includes information for identifying a location of a resynchronization point and information for decoding the bitstream following the resynchronization point; and
decoding the received bitstream.
85. The computer-readable medium of claim 84, wherein the method further comprises:
locating a resynchronization point in the bitstream according to the resynchronization point information.
86. The computer-readable medium of claim 84, wherein the information for decoding the bitstream comprises information relating to a context in which a bitstream following the resynchronization point is encoded.
87. The computer-readable medium of claim 86, wherein the method further comprises:
comparing a current context of the decoded bitstream with received context information included in the resynchronization point information; and
stopping decoding the bitstream and resuming decoding the bitstream at the resynchronization point if the comparing shows that the current context is not the same as the received context information.
88. The computer-readable medium of claim 84, wherein the location of the resynchronization point is within a video segment selected from a member of the group consisting of a sub-macroblock, a slice, a frame, and a sequence of frames.
89. The computer-readable medium of claim 84, wherein the location of the resynchronization point is a start of a macroblock.
90. The computer-readable medium of claim 84, wherein the information for decoding the bitstream comprises information related to adjacent video segments.
91. The computer-readable medium of claim 84, wherein the information for decoding the bitstream comprises a member of the group consisting of a quantization parameter, a spatial prediction mode identifier, and a number of non-zero coefficients.
92. The computer-readable medium of claim 84, wherein the method further comprises:
receiving the resynchronization point information in a data message, wherein the data message is a member of the group consisting of an in-band application message, a user-specific private data message, a supplemental enhancement information message, and an MPEG user data message.
93. The computer-readable medium of claim 84, wherein the method further comprises:
receiving resynchronization point information encoded using a variable length code.
94. The computer-readable medium of claim 84, wherein the method further comprises:
detecting an error in the bitstream;
stopping decoding of the bitstream; and
resuming decoding at the located resynchronization point.
95. A method for encoding multimedia data, comprising:
encoding resynchronization point data; and
inserting the resynchronization point data into a slice of the multimedia stream.
96. The method of claim 95, further comprising:
selecting a location of a resynchronization point within the slice; and wherein the inserting comprises:
inserting the resynchronization point into the selected location.
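Claims 95-96 insert encoded resynchronization point data into a slice at a selected location. At its simplest this is a splice into the slice payload; the byte-level view below is an illustrative assumption, since the claims do not fix the representation.

```python
def insert_resync_data(slice_payload: bytes, location: int,
                       resync_bits: bytes) -> bytes:
    """Insert encoded resynchronization point data into a slice payload
    at the selected position (illustrative byte-level sketch; a real
    encoder would splice at a bit offset within the entropy-coded data).
    """
    return slice_payload[:location] + resync_bits + slice_payload[location:]
```

The location itself would come from a selection step such as the rate-distortion comparison of claim 97.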
97. The method of claim 96, further comprising:
calculating rate-distortion costs for a plurality of candidate locations; and
selecting at least one candidate location according to the rate-distortion cost.
98. The method of claim 95, wherein the resynchronization point comprises context information for the multimedia data.
99. An apparatus for encoding multimedia data, comprising:
encoding means for encoding resynchronization point data; and
inserting means for inserting the resynchronization point data into a slice of a multimedia stream.
100. The apparatus of claim 99, further comprising:
selecting means for selecting a location of a resynchronization point within the slice; and wherein the inserting means comprises:
inserting means for inserting the resynchronization point into the selected location.
101. The apparatus of claim 100, wherein the means for selecting comprises:
computing means for computing rate-distortion costs for a plurality of candidate locations; and
selecting means for selecting at least one candidate location according to the rate-distortion cost.
102. The apparatus of claim 99, wherein the resynchronization point comprises context information for the multimedia data.
103. A method for processing a multimedia stream, comprising:
receiving resynchronization point data in a slice of the multimedia stream; and
reconstructing the multimedia data according to the resynchronization point data.
104. The method of claim 103, wherein the resynchronization point comprises context information for the multimedia data.
105. An apparatus for processing a multimedia stream, comprising:
receiving means for receiving resynchronization point data in a slice of the multimedia stream; and
reconstructing means for reconstructing the multimedia data according to the resynchronization point data.
106. The apparatus of claim 105, wherein the resynchronization point comprises context information for the multimedia data.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US60/660,879 | 2005-03-10 | ||
| US60/713,207 | 2005-09-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HK1113886A true HK1113886A (en) | 2008-10-17 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7929776B2 (en) | Method and apparatus for error recovery using intra-slice resynchronization points | |
| AU774040B2 (en) | Video coding | |
| CN1258928C (en) | Error concealment method and encoder for improved error concealment in video coding | |
| CN1192635C (en) | Method and device for video encoding and decoding | |
| US10425661B2 (en) | Method for protecting a video frame sequence against packet loss | |
| Superiori et al. | Performance of an H.264/AVC error detection algorithm based on syntax analysis |
| US20050089102A1 (en) | Video processing | |
| Zheng et al. | Error-resilient coding of H.264 based on periodic macroblock |
| EP1345451A1 (en) | Video processing | |
| HK1113886A (en) | Method and apparatus for error recovery using intra-slice resynchronization points | |
| Aladrovic et al. | An error resilience scheme for layered video coding | |
| EP1349398A1 (en) | Video processing | |
| Rhaiem et al. | New robust decoding scheme-aware channel condition for video streaming transmission |