HK1131298B - A method and system for processing video signal - Google Patents
- Publication number
- HK1131298B (HK09108970.8A)
- Authority
- HK
- Hong Kong
Description
Technical Field
The present invention relates to video data processing, and more particularly, to a method and system for processing B pictures with missing or invalid forward references.
Background
Various existing compression methods, including AVS1-P2, generate data for a current video picture that indicates the difference between the current picture and a reference picture. AVS1-P2 is a video standard established by the Audio Video Coding Standard Workgroup of China. The workgroup was authorized in June 2002 by the Science and Technology Department of the Ministry of Information Industry. The main role of the workgroup is to establish (or edit) common technical standards for the compression, decoding, processing, and representation of digital audio-video data. The standard can be used for high-definition digital broadcasting, high-density laser digital storage media, wireless broadband multimedia communication, and Internet broadband streaming media.
There are three basic picture types in the AVS1-P2 standard: intra (I) pictures, predictive (P) pictures, and bi-predictive (B) pictures. Functionally, this classification continues the I-, P-, and B-picture concepts of earlier standards. P-picture coding uses forward pictures for prediction, while B-picture coding uses forward, backward, or bi-directional prediction.
Similar to earlier standards, AVS1-P2 also uses sequence headers as Random Access Points (RAPs) for features such as channel switching. After a sequence header, a P picture can refer only to pictures after the sequence header, while a B picture can refer to pictures before the sequence header. The picture used for reference may be a forward picture and/or a backward picture relative to the current picture. However, if the forward picture used for reference is corrupted, the current picture cannot be decoded correctly.
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.
Disclosure of Invention
A method and/or system for processing a B picture with a missing or invalid forward reference picture, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
According to an aspect of the present invention, there is provided a video signal processing method, the method including:
after receiving a sequence header in compressed video data, determining whether a first I picture in the compressed video data is immediately followed by a sequence of consecutive adjacent B pictures;
and decoding the adjacent sequence of B pictures according to the determination, wherein the decoding operation performs B picture discarding or corresponding interpolation processing according to a video editing code and/or a random access point, the video editing code indicating that consecutive B pictures immediately following an I picture may have invalid or missing forward reference pictures.
Preferably, the method further comprises: discarding each B picture in the adjacent sequence of B pictures determined to immediately follow the first I picture.
Preferably, the method further comprises: discarding each B picture in the neighboring sequence of B pictures when a forward reference picture referenced by each B picture in the neighboring sequence of B pictures is invalid or missing.
Preferably, the method further comprises: interpolating a decoded picture corresponding to each B picture in the adjacent sequence of B pictures when a forward reference picture referred to by each B picture in the adjacent sequence of B pictures is invalid or missing.
Preferably, when the forward reference picture is invalid but available, the interpolation operation is performed using the following algorithm:
((m+1-n)/(m+1))*(decoded forward reference picture) + (n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
Preferably, when the forward reference picture is missing, the interpolation operation is performed using the following algorithm:
(n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
According to an aspect of the present invention, there is provided a video signal processing method, the method including:
transitioning from a first state to a second state upon detecting, in the first state, a video editing code or a random access indication in AVS1-P2 format;
transitioning to an error state upon detecting, in the second state, one of an I picture, a P picture, a B picture, or a video editing code;
transitioning to a third state upon detecting a sequence header in the second state, following detection of a video editing code or random access indication in the AVS1-P2 format in the first state;
transitioning from the third state to the error state upon detecting, in the third state, one of a P picture, a B picture, a video editing code, or a sequence header;
transitioning to a fourth state upon detecting an I picture in the third state;
transitioning to the first state upon detecting an I picture or a P picture in the fourth state;
transitioning to a fifth state upon detecting, in the fourth state, a B picture that references a forward reference picture;
transitioning to the first state upon detecting an I picture or a P picture in the fifth state;
and transitioning to the fourth state upon detecting, in the fifth state, a B picture that does not reference a forward reference picture.
Preferably, the method further comprises: discarding each B picture detected in the fourth and fifth states.
Preferably, the method further comprises: discarding each B picture detected in the fifth state that references a forward reference picture.
Preferably, the method further comprises: interpolating a decoded picture corresponding to the B picture detected in the fourth or fifth state when the forward reference picture is invalid or missing.
Preferably, when the forward reference picture is invalid but available, the interpolation operation is performed using the following algorithm:
((m+1-n)/(m+1))*(decoded forward reference picture) + (n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
Preferably, when the forward reference picture is missing, the interpolation operation is performed using the following algorithm:
(n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
According to one aspect of the present invention, there is provided a video signal processing system, comprising:
one or more circuits configured to determine whether a first I picture in the compressed video data is immediately followed by a sequence of consecutive adjacent B pictures, and decode the sequence of consecutive B pictures according to the determination, wherein the decoding operation performs B picture dropping or corresponding interpolation processing according to a video editing code and/or a random access point, the video editing code indicating that the consecutive B pictures immediately following the I picture may have invalid or missing forward reference pictures.
Preferably, the one or more circuits discard each B picture in the adjacent sequence of B pictures determined to immediately follow the first I picture.
Preferably, the one or more circuits are configured to discard each B picture in the neighboring sequence of B pictures when the forward reference picture referred to by each B picture in the neighboring sequence of B pictures is invalid or missing.
Preferably, the one or more circuits are configured to interpolate a decoded picture corresponding to each B picture in the adjacent sequence of B pictures when a forward reference picture referred to by each B picture in the adjacent sequence of B pictures is invalid or missing.
Preferably, the one or more circuits include one or more processors configured to perform interpolation of decoded pictures, wherein when the forward reference picture is invalid but available, the interpolation operation is performed using the following algorithm:
((m+1-n)/(m+1))*(decoded forward reference picture) + (n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
Preferably, the one or more circuits include one or more processors configured to perform interpolation of decoded pictures, wherein, when the forward reference picture is missing, the interpolation operation is performed using the following algorithm:
(n/(m+1))*(decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
According to an aspect of the present invention, a video signal processing method includes: generating a B-picture neighbor sequence such that when a B-picture generated in the B-picture neighbor sequence is indicated as not referring to a forward reference picture, other B-pictures in the B-picture neighbor sequence that follow the B-picture are also indicated as not referring to the forward reference picture, wherein the B-picture neighbor sequence is generated immediately after a first I-picture that follows a sequence header.
According to an aspect of the present invention, a video signal processing system includes: one or more circuits configured to generate a neighboring sequence of B pictures such that when a generated B picture in the neighboring sequence of B pictures indicates no reference to a forward reference picture, other B pictures in the neighboring sequence of B pictures that follow the B picture also indicate no reference to the forward reference picture, wherein the neighboring sequence of B pictures is generated immediately after a first I picture following a sequence header.
Various advantages, aspects and novel features of the invention, as well as details of an illustrated embodiment thereof, will be more fully described with reference to the following description and drawings.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a block diagram of a part of a mobile terminal according to an embodiment of the present invention;
FIG. 2a is a schematic illustration of random access of video images for use in conjunction with an embodiment of the present invention;
FIG. 2b is a diagram illustrating the effect of editing video images according to an embodiment of the present invention;
FIG. 3 is a flow chart of decoding B pictures with invalid or missing forward reference pictures in an embodiment of the present invention;
FIG. 4 is a flow chart of a method of displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention;
FIG. 5a is a flow chart of another method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention;
FIG. 5b is a flow chart of an encoding method used with a method for displaying B pictures with invalid or missing forward reference pictures according to an embodiment of the present invention;
FIG. 6 is a flow chart of another method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention.
Detailed Description
Some embodiments of the present invention relate to methods and systems for processing B pictures with invalid or missing forward reference pictures. The invention includes decoding an adjacent sequence of B pictures that immediately follows the first I picture after a sequence header. Decoding of the B pictures may take into account a video editing code, for example in video data in AVS1-P2 format, and/or a random access point. The video editing code may indicate, for example, that the B pictures immediately following it may have an invalid forward reference picture. Video decoding at a random access point may start, for example, from the sequence header, in which case the forward reference picture has not been decoded. Accordingly, the forward reference picture may be missing.
Some embodiments of the invention may discard each B picture in the adjacent sequence of B pictures following the first I picture after the sequence header. Other embodiments may determine whether the forward reference picture of each B picture in the adjacent sequence is missing or invalid, and discard a B picture if it indicates that it refers to an invalid or missing forward reference picture. Still other embodiments may interpolate B pictures based on the forward and backward reference pictures.
For an invalid forward reference picture, the decoded picture used to interpolate the first of two B pictures in the adjacent sequence is described as follows: (2/3) (decoded forward reference picture) + (1/3) (decoded backward reference picture). Similarly, for an invalid forward reference picture, the decoded picture used for interpolating the second B picture in the adjacent sequence of B pictures is described as follows: (1/3) (decoded forward reference picture) + (2/3) (decoded backward reference picture). If the decoded forward reference picture is missing, e.g., random access occurs, the first B picture is interpolated as: (1/3) (decoded backward reference picture), the second B picture is interpolated as: (2/3) (decoded backward reference picture).
The interpolation for each B picture referring to an invalid forward reference picture can be generalized as ((m+1-n)/(m+1))*(decoded forward reference picture) + (n/(m+1))*(decoded backward reference picture), and the interpolation for each B picture referring to a missing forward reference picture as (n/(m+1))*(decoded backward reference picture). Here the parameter "m" represents the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" represents the position of a B picture in the sequence. For example, n is 1 for the first B picture, 2 for the second B picture, and so on.
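The generalized weights can be checked against the two-picture case given earlier. The following is a minimal Python sketch; the function name `interpolation_weights` and the `forward_available` parameter are illustrative assumptions and do not appear in the specification.

```python
from fractions import Fraction


def interpolation_weights(m, n, forward_available=True):
    """Return (forward_weight, backward_weight) for the n-th of m B pictures.

    m: number of B pictures in the adjacent sequence.
    n: 1-based position of the B picture in that sequence.
    forward_available: False models a missing forward reference picture
    (for example after random access), in which case only the backward
    term of the formula remains.
    """
    if m < 1 or not 1 <= n <= m:
        raise ValueError("need m >= 1 and 1 <= n <= m")
    backward = Fraction(n, m + 1)
    if forward_available:
        forward = Fraction(m + 1 - n, m + 1)
    else:
        forward = Fraction(0)
    return forward, backward


# Two B pictures (m = 2), as in the example in the description.
first = interpolation_weights(2, 1)    # weights 2/3 (forward) and 1/3 (backward)
second = interpolation_weights(2, 2)   # weights 1/3 (forward) and 2/3 (backward)
```

When the forward reference picture is available, the two weights always sum to 1, so the interpolated picture moves linearly from the forward reference toward the backward reference as n increases.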
Fig. 1 is a block diagram of a part of a mobile terminal according to an embodiment of the present invention. A mobile terminal 100 is shown. As shown in Fig. 1, the mobile terminal 100 includes an image sensor 110, an image processor 112, a processor 114, and a memory 116. The image sensor 110 may include suitable circuitry and/or logic for capturing light intensity at a plurality of colors, such as red, green, and blue. The captured light intensity levels are processed into video and/or still image output. For example, the color levels are converted to the YUV color space, and the resulting image information is passed to, for example, the image processor 112 for further processing.
The image processor 112 may comprise suitable circuitry and/or logic capable of processing video information. The image processor 112 also includes a video encoder 112a and a video decoder 112b. The video encoder 112a may comprise suitable logic, circuitry, and/or code that may enable compression of video data, among other things. The video decoder 112b may comprise suitable logic, circuitry, and/or code that may enable decompression of video data for display. The processor 114 determines the operational mode of the various portions of the mobile terminal 100. For example, the processor 114 may set up data registers within the image processor 112 to enable transfer of video data to the memory 116 through direct memory access. The processor 114 may also initiate image capture by transmitting instructions to the image sensor 110. The memory 116 is used to store the image data processed and transferred by the image processor 112. The memory 116 may also be used for storing code and/or data used by the processor 114, and data needed for implementing other functions of the mobile terminal 100. For example, the memory 116 may also store data for voice communications.
In operation, the processor 114 initiates image capture by the image sensor 110. The image sensor 110 may transmit video data corresponding to the captured image to the image processor 112. The video encoder 112a in the image processor 112 may compress the video data to facilitate storage or transfer of the data to other devices. The image processor 112 may also decode video data transmitted to the mobile terminal 100. The decoding operation may be implemented by the video decoder 112b, which includes processing methods for B pictures with invalid or missing forward reference pictures. The processing of B pictures with invalid or missing forward reference pictures is described with reference to Figs. 2a to 6. The video data in the memory 116 may be further processed by the processor 114.
Fig. 2a is a schematic illustration of random access of video images in use in conjunction with an embodiment of the invention. A sequence of video information 200 is shown for decoding display, wherein the random access point is a sequence header 210. Since many video coding standards, including MPEG1, MPEG2, and AVS1-P2, utilize forward and backward reference pictures to generate image frames, control of video decoding is achieved through control of the video decoding start point. For example, some video pictures generated by video coding may include intra (I) pictures, bi-directional predicted (B) pictures, and predicted (P) pictures.
I pictures contain complete information for displaying the picture. A P picture contains difference information between a previous I or P reference picture and the current P picture. A B picture contains difference information between a previous I or P reference picture and the current B picture, and between a subsequent I or P reference picture and the current B picture. Since a B picture may need to refer to a reference picture that is displayed after it, the encoded video file may contain pictures out of display order, so that the current picture can refer to a later-displayed picture that has already been decoded.
As shown in Fig. 2a, a sequence header 210, which contains information related to decoding the compressed video data, is the starting point of the video information 200. The sequence header 210 may include, for example, width and height information for the decompressed video picture. The video pictures immediately following the sequence header 210 include I picture 212, B pictures 214 and 216, P picture 218, and B pictures 220 and 222. Also shown is a reference picture 205 from a previous video sequence, which is referenced by B pictures 214 and 216.
Since displaying a picture may require one or more later-displayed pictures that it references, the compression method can compress such a later picture and place it in the file ahead of the pictures that reference it. For example, although the pictures in the video information sequence 200 are transmitted in the order I picture 212, B picture 214, B picture 216, P picture 218, B picture 220, and B picture 222, they are displayed in a different order: B picture 214, B picture 216, I picture 212, B picture 220, B picture 222, and P picture 218.
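The relationship between transmission order and display order in this example can be sketched in a few lines of Python. The sketch assumes a simple reordering rule that is not spelled out in the specification: each B picture is displayed as soon as it is decoded, while an I or P picture is held back until the next I or P picture arrives (or the stream ends). The function name `display_order` and the tuple representation are hypothetical.

```python
def display_order(transmitted):
    """Reorder (picture_type, picture_id) tuples from transmission order
    to display order, holding back one reference picture at a time."""
    held_reference = None  # most recent I or P picture not yet displayed
    shown = []
    for kind, pic_id in transmitted:
        if kind == "B":
            # B pictures are displayed before the held reference picture.
            shown.append((kind, pic_id))
        else:
            # A new reference picture releases the previously held one.
            if held_reference is not None:
                shown.append(held_reference)
            held_reference = (kind, pic_id)
    if held_reference is not None:
        shown.append(held_reference)  # flush at end of stream
    return shown


# Transmission order of video information 200 in Fig. 2a.
stream = [("I", 212), ("B", 214), ("B", 216),
          ("P", 218), ("B", 220), ("B", 222)]
order = display_order(stream)
```

Under this assumed rule, `order` reproduces the display order listed above for Fig. 2a.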
In some cases, a user of the mobile terminal 100 may request that display start from a random access point of the video instead of the starting point of the video file. For example, a user may wish to start watching a video at some time offset from the start of the video. In this case, the video display includes processing of data starting at a sequence header near the user-selected point in time, for example, starting at sequence header 210. Which sequence header is selected as the random access point depends on the design. The sequence header may include information that the video decoder 112b needs to perform the proper decompression operation. P picture 218 may refer to other pictures in video sequence 200 that follow sequence header 210. However, B pictures 214 and 216 may refer to reference picture 205 in the previous video sequence before sequence header 210, and may also refer to backward reference picture 212.
Therefore, since B pictures 214 and 216 need to refer to the reference picture 205, which has not been decoded, they cannot be decompressed correctly. However, since B pictures are not used as references for other pictures, any artifacts in the decompressed B pictures 214 and 216 will be confined to B pictures 214 and 216.
Fig. 2b is a schematic diagram illustrating the effect of editing a video image according to an embodiment of the present invention. The figure shows video information 250 to be decompressed for display. For example, video information 250 may include a first sequence header 260, an I picture 262, B pictures 264 and 266, a P picture 268, B pictures 270 and 272, a Video Editing Code (VEC) 274, a second sequence header 276, an I picture 278, and B pictures 280 and 282. The decompressed pictures of video information 250 will be displayed in the following order: B picture 264, B picture 266, I picture 262, B picture 270, B picture 272, P picture 268, B picture 280, B picture 282, and I picture 278.
The sequence headers 260 and 276 and the pictures 262 … 272 and 278 … 282 are similar to the corresponding parts described in Fig. 2a. The video editing code 274 is adopted by standards such as AVS1-P2, established by the Audio Video Coding Standard Workgroup of China. The video editing code is optional, and is used to indicate that the consecutive B pictures immediately following an I picture may have invalid or missing forward reference pictures. For example, the video editing code 274 indicates that the B pictures 280 and 282 may not be decoded correctly. This would occur if a portion of the video file, including the pictures 262 … 272, were edited such that it was compressed differently from the subsequent video pictures, or if the portion were deleted. Thus, even if the forward reference picture 268 has already been decoded and stored in the picture buffer, the decoding of the B pictures 280 and 282 can no longer rely on the forward reference picture 268.
The AVS1-P2 standard may allow B pictures to use a "no forward reference" flag. When asserted, the flag indicates that the B picture does not refer to a forward reference picture. If the "no forward reference" flag is not asserted, the B picture may refer to a forward reference picture. Thus, by checking whether the "no forward reference" flag is asserted, the video decoder 112b can take appropriate action when a forward reference picture is invalid or missing. These actions are described with reference to Figs. 3 to 6.
Fig. 3 is a flow chart of decoding B pictures with invalid or missing forward reference pictures in an embodiment of the present invention. As shown, the flow comprises steps 300 to 310. In step 300, random access is indicated to the video decoder 112b, or a video editing code, such as the VEC 274, is detected. Decoding of I pictures and/or P pictures can also be performed in this step. In step 302, the video decoder 112b, which is capable of decoding B pictures with invalid or missing forward reference pictures, determines whether the subsequent start code corresponds to a sequence header. If the subsequent start code does not correspond to a sequence header, step 304 is performed next; if it corresponds to a sequence header, e.g., sequence header 276, step 306 is performed next.
Step 304 is an error state that handles errors when an unexpected start code is encountered. The error handling depends on the design. Thus, in step 304, an error handling operation is performed when the expected start code for the sequence header is not detected. In step 306, the video decoder 112b determines whether the next start code corresponds to an I picture. If it does, the I picture is decoded and step 308 is performed next; if it does not, step 304 is performed next. In that case, the error handling operation of step 304 is performed because the expected I picture is invalid or missing.
In step 308, if a B picture is detected and its "no forward reference" flag is not asserted, step 310 is performed next. If the "no forward reference" flag is asserted, the B picture will be processed. The processing of B pictures is described with reference to Figs. 4 to 6. Otherwise, if an I picture or a P picture is detected, that picture is processed and the flow returns to step 300 to wait for a random access indication or a video editing code.
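The flow of Fig. 3 can be read as a small state machine. The Python sketch below is illustrative only: the token names (`RAP`, `VEC`, `SEQ`, `I`, `P`, `B_FWD`, `B_NOFWD`) and the function names are hypothetical, states are labelled with the step numbers of Fig. 3, and two behaviours are assumptions rather than statements from the specification: inputs whose handling is not described leave the state unchanged, and the error state 304 is never left.

```python
# States named after the steps of Fig. 3.
WAIT, EXPECT_SEQ, EXPECT_I, AFTER_I, B_FWD_SEEN, ERROR = (
    "300", "302", "306", "308", "310", "304")

# Transitions described in the text for steps 300, 308 and 310.
# B_FWD: B picture whose "no forward reference" flag is not asserted.
# B_NOFWD: B picture whose "no forward reference" flag is asserted.
TRANSITIONS = {
    WAIT: {"RAP": EXPECT_SEQ, "VEC": EXPECT_SEQ},
    AFTER_I: {"I": WAIT, "P": WAIT, "B_FWD": B_FWD_SEEN, "B_NOFWD": AFTER_I},
    B_FWD_SEEN: {"I": WAIT, "P": WAIT, "B_FWD": B_FWD_SEEN, "B_NOFWD": AFTER_I},
}


def next_state(state, token):
    """Return the state reached from `state` on input `token`."""
    if state == ERROR:
        return ERROR  # assumption: error handling is design dependent
    if state == EXPECT_SEQ:
        return EXPECT_I if token == "SEQ" else ERROR
    if state == EXPECT_I:
        return AFTER_I if token == "I" else ERROR
    # Assumption: inputs not covered by the description keep the state.
    return TRANSITIONS[state].get(token, state)


def run(tokens, state=WAIT):
    """Feed a sequence of tokens through the state machine."""
    for token in tokens:
        state = next_state(state, token)
    return state


# VEC 274, sequence header 276, I picture 278, then B pictures 280 and 282
# that both reference a forward reference picture.
final = run(["VEC", "SEQ", "I", "B_FWD", "B_FWD"])
```

A table-driven form keeps the described transitions in one place, which makes it straightforward to compare the sketch against the flow chart.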
Fig. 4 is a flow chart of a method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention. As shown, the flow comprises steps 400 to 406, which may be part of steps 308 and/or 310. In step 400, it is determined whether the current picture is a B picture; if so, step 402 is performed, otherwise step 300 is performed. In step 402, it is determined whether the B picture refers to a forward reference picture; if so, step 404 is performed, otherwise step 406 is performed. In step 404, the B picture is discarded, and step 310 is performed next. In step 406, the B picture is also discarded, and step 308 is performed next. Therefore, the adjacent sequence of B pictures that immediately follows the first I picture after a sequence header, for example B pictures 280 and 282, will be discarded.
Although two separate states 308 and 310 may be used by this method, a single state, including states 308 and 310, may be used in other embodiments of the present invention since B-pictures are discarded regardless of whether they refer to a forward reference picture.
Fig. 5a is a flow chart of another method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention. As shown in Fig. 5a, the flow comprises steps 500 to 504, which may be part of step 310. In step 500, it is determined whether the current picture is a B picture; if so, step 502 is performed, and if not, step 300 is performed. In step 502, it is determined whether the B picture refers to a forward reference picture; if so, step 504 is performed, otherwise step 308 is performed. In step 504, the B picture is discarded. Since the B pictures in step 310 may refer to forward reference pictures, the method can selectively discard those B pictures that refer to invalid or missing forward reference pictures. For example, B pictures in the adjacent sequence that immediately follows the first I picture after the sequence header, e.g., B pictures 280 and 282, would be discarded if they referenced an invalid or missing forward reference picture. Those B pictures that do not refer to a forward reference picture, such as the B pictures processed in step 308, are decompressed and displayed.
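The difference between the discard-all behaviour of Fig. 4 and the selective behaviour of Fig. 5a can be illustrated with a short Python sketch. The function name, the policy strings, and the flag values chosen for B pictures 280 and 282 are hypothetical and serve only as an example.

```python
def pictures_to_display(b_sequence, policy):
    """Select which B pictures of the adjacent sequence are displayed.

    b_sequence: list of (picture_id, refers_forward) tuples for the
    B pictures that follow the first I picture after a sequence header.
    policy: "discard_all" models Fig. 4, "discard_forward" models Fig. 5a.
    """
    if policy == "discard_all":
        # Fig. 4: every B picture is discarded, whatever it references.
        return []
    if policy == "discard_forward":
        # Fig. 5a: only B pictures that refer to the (invalid or missing)
        # forward reference picture are discarded.
        return [pic_id for pic_id, refers_forward in b_sequence
                if not refers_forward]
    raise ValueError("unknown policy: " + policy)


# Hypothetical flags: 280 refers to the forward reference, 282 does not.
sequence = [(280, True), (282, False)]
kept_fig4 = pictures_to_display(sequence, "discard_all")
kept_fig5a = pictures_to_display(sequence, "discard_forward")
```

With these example flags, the Fig. 4 policy displays neither B picture, while the Fig. 5a policy still displays B picture 282.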
Fig. 5b is a flow chart of an encoding method used with a method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention. As shown, steps 510 to 524 generate "no forward reference" flags for B pictures that do not toggle within a sequence. Asserting the "no forward reference" flag in a B picture indicates that the B picture does not refer to a forward reference picture. For example, if B picture 280 contains an asserted "no forward reference" flag, then decoding of B picture 280 does not require reference to P picture 268. Similarly, if B picture 280 contains a de-asserted "no forward reference" flag, then proper decoding of B picture 280 requires reference to P picture 268. This method is effective in reducing or eliminating jitter, which may occur, for example, when decoded video pictures are displayed and the "no forward reference" flag has been set at random in the B pictures. Thus, if the "no forward reference" flag is asserted for a certain B picture in an adjacent sequence of B pictures, the "no forward reference" flag for the remaining B pictures in the sequence will also be asserted.
In step 510, a sequence header of a video sequence is generated. In step 512, the video encoder 112a de-asserts the master "no forward reference" flag. The status of the master "no forward reference" flag will also be copied to the "no forward reference" flag of each B picture header. In step 514, if an I picture or a P picture is to be generated, step 516 is performed next; otherwise step 518 is performed next. In step 516, an I picture or P picture appropriate for the video sequence is generated, and step 514 is performed next. In step 518, the video encoder 112a determines whether a B picture, e.g., B picture 280, refers to a forward reference picture, e.g., P picture 268. If so, step 520 is performed next; otherwise step 522 is performed.
In step 520, the video encoder 112a determines whether the master "no forward reference" flag is asserted. If so, step 522 is performed next; otherwise step 524 is performed. In step 522, the "no forward reference" flag in the B picture header is set to the asserted state. Thus, once the master "no forward reference" flag is asserted, the "no forward reference" flag in the B picture header will be set to the asserted state regardless of whether the B picture refers to a forward reference picture. In step 524, if the next data block to be generated by the video encoder 112a is a sequence header, step 510 is performed next; otherwise step 514 is performed next.
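The flag handling of steps 510 to 524 can be sketched in Python as follows. The sketch makes one assumption that the description implies but does not state explicitly: the master flag is asserted in step 522, at the same time as the flag in the B picture header. It also follows the flow as described, in which the master flag is de-asserted only when a sequence header is generated. The function name and the tuple encoding of the stream are hypothetical.

```python
def assign_no_forward_flags(units):
    """Return the "no forward reference" flag for each B picture, in order.

    units: coding-order list of tuples, where ("SEQ",) is a sequence
    header, ("I",) and ("P",) are reference pictures, and
    ("B", refers_forward) is a B picture together with whether the
    encoder would otherwise use a forward reference for it.
    """
    master = False  # master "no forward reference" flag
    flags = []
    for unit in units:
        kind = unit[0]
        if kind == "SEQ":
            master = False              # steps 510 and 512
        elif kind == "B":
            refers_forward = unit[1]    # step 518
            if not refers_forward or master:   # step 518 "no" or step 520 "yes"
                master = True           # assumption: asserted in step 522
                flags.append(True)      # step 522
            else:
                flags.append(False)     # flag stays de-asserted
        # I and P pictures (step 516) do not change the master flag.
    return flags


units = [("SEQ",), ("I",), ("B", True), ("B", False), ("B", True),
         ("SEQ",), ("I",), ("B", True)]
flags = assign_no_forward_flags(units)
```

In this example the third B picture would otherwise use a forward reference, but its flag is asserted because an earlier B picture in the same sequence already asserted it; the flag becomes de-asserted again only after the next sequence header.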
Fig. 6 is a flow chart of another method for displaying B pictures with invalid or missing forward reference pictures in an embodiment of the present invention. The flow comprises steps 600 to 610, which are part of step 310. Steps 600 to 610 are a process flow for processing a video sequence, such as the video sequences described in Figs. 2a and/or 2b.
In step 600, it is checked whether the counter for counting B pictures is zero. If so, the first B picture is processed in step 602, otherwise, the second B picture is processed in step 608. In step 602, the first B picture that references an invalid forward reference picture is replaced with the appropriate interpolated picture. The interpolation operation depends on the design. The interpolation algorithm for the B image will be described after step 610. If an invalid forward reference picture is used, artifacts will be seen in the display of the B picture, for example, on a display (not shown) of the mobile terminal 100. In step 604, the counter that counts the B pictures is incremented. In step 606, the next picture, which may be a B picture, is parsed to determine if the "no forward reference" flag is asserted. If so, step 308 is performed next, otherwise step 600 is performed.
In step 608, the second B picture that references the invalid forward reference picture is replaced with the appropriate interpolated picture. In step 610, the counter for counting B pictures is cleared, and step 300 is performed next.
In some cases, the first B picture 280 and/or the second B picture 282 following the I picture 278 are replaced by a suitable interpolated picture. Although a number of interpolation methods are available, the example method used here interpolates a new B picture as a weighted sum of the decoded backward reference picture I2', corresponding to the I picture 278, and the decoded forward reference picture P5', corresponding to the P picture 268. For example, when the video editing code 274 indicates that the forward reference picture is invalid, the weighting operations that generate the decoded pictures corresponding to the B0 picture 280 and the B1 picture 282, respectively, are as follows:
B0 = (2/3)*(P5') + (1/3)*(I2')
B1 = (1/3)*(P5') + (2/3)*(I2').
When the decoded forward reference picture is not available, e.g., when random access occurs, the weighting operations that generate the decoded pictures corresponding to the B0 picture 280 and the B1 picture 282, respectively, are as follows:
B0 = (1/3)*(I2')
B1 = (2/3)*(I2').
Thus, the B pictures may be described as fading in or out from the previous video sequence to the current video sequence. The interpolation operation is performed, for example, by the processor 114 and/or the image processor 112.
The linear interpolation for each B picture referring to an invalid forward reference picture is: ((m+1-n)/(m+1)) * (decoded forward reference picture) + (n/(m+1)) * (decoded backward reference picture), and the linear interpolation performed on a B picture referring to a missing forward reference picture is: (n/(m+1)) * (decoded backward reference picture). Here the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is the position of the B picture in that sequence. For example, for the adjacent sequence of B pictures 280 and 282 in fig. 2b, m is 2, n is 1 for B picture 280, and n is 2 for B picture 282.
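The general weighting rule can be illustrated with a short sketch. The function names are hypothetical, and decoded pictures are modeled as flat lists of sample values purely for illustration; a real decoder would operate on its own frame buffers.

```python
from fractions import Fraction


def interpolation_weights(m: int, n: int, forward_available: bool):
    """Return (forward_weight, backward_weight) for the n-th of m B pictures."""
    if not 1 <= n <= m:
        raise ValueError("n must satisfy 1 <= n <= m")
    backward = Fraction(n, m + 1)
    # A missing forward reference contributes nothing to the blend.
    forward = Fraction(m + 1 - n, m + 1) if forward_available else Fraction(0)
    return forward, backward


def interpolate(forward_px, backward_px, m: int, n: int):
    """Blend two decoded pictures given as flat lists of sample values.

    Pass forward_px=None when the forward reference picture is missing.
    """
    fw, bw = interpolation_weights(m, n, forward_px is not None)
    if forward_px is None:
        return [float(bw * b) for b in backward_px]
    return [float(fw * f + bw * b) for f, b in zip(forward_px, backward_px)]
```

For m = 2 this reproduces the weights given above: `interpolation_weights(2, 1, True)` yields 2/3 and 1/3, and `interpolation_weights(2, 2, False)` yields 0 and 2/3.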
Thus, by appropriate processing of the video file containing the sequence header and/or the video file containing the video editing code, many embodiments of the present invention do not generate artifacts in displaying the video information in the event that the B picture does not have a valid forward reference picture.
According to one embodiment of the invention, the system may include an image processor 112 that includes a video encoder 112a and a video decoder 112b. The video decoder 112b may decode an adjacent sequence of B pictures in the compressed video data that immediately follows the first I picture after the sequence header 276, e.g., the B pictures 280 and 282 following the I picture 278. The video decoder 112b may process video editing codes and/or random access points within video data in, for example, the AVS1-P2 format. In some embodiments of the present invention, the video decoder 112b discards the adjacent sequence of B pictures 280 and 282.
In other embodiments of the present invention, the video decoder 112b can determine whether the forward reference picture for B pictures 280 and 282 (e.g., P picture 268) is invalid or missing. This determination may be made by checking whether the "no forward reference" flag is asserted. An asserted "no forward reference" flag indicates that the B picture has no forward reference picture. The "no forward reference" flag may be part of each picture compressed according to, for example, the AVS1-P2 standard. The video decoder 112b discards each B picture in the adjacent sequence of B pictures that references an invalid or missing forward reference picture.
In some embodiments of the invention, the B pictures 280 and 282 are generated by the video encoder 112a. In an embodiment of the present invention, when the "no forward reference" flag of the B picture 280 is asserted, the video encoder 112a may assert the "no forward reference" flag of the B picture 282 as well. This approach may reduce artifacts when the compressed pictures are decoded. Although B pictures 280 and 282 are described herein as examples, the algorithm is equally applicable to video sequences with more than two adjacent B pictures. Thus, when the "no forward reference" flag of a B picture in the adjacent sequence of B pictures generated immediately after the first I picture following the sequence header is asserted, the "no forward reference" flags of the remaining B pictures in the sequence are also asserted.
In the case where, for example, the B picture 280 indicates, by its deasserted "no forward reference" flag, that it refers to a forward reference picture (e.g., P picture 268), the B picture 280 cannot be decoded properly, because the video editing code 274 indicates that the P picture 268 is not a valid forward reference picture for the B pictures 280 and 282. Thus, rather than referencing the P picture 268 or discarding the B pictures 280 and/or 282, an interpolated decoded picture can be generated by linear or non-linear interpolation. The interpolation operation may be performed by the video encoder 112a and/or the processor 114 in the image processor 112.
In some embodiments of the present invention, the B pictures 280 and 282 are interpolated linearly, as in the following example. When a decoded forward reference picture is available but not valid, the linear interpolation for the B picture 280 is: (2/3) * (decoded forward reference picture) + (1/3) * (decoded backward reference picture). Similarly, the linear interpolation for the B picture 282 is: (1/3) * (decoded forward reference picture) + (2/3) * (decoded backward reference picture).
When the decoded forward reference picture is not available, e.g., when random access occurs, the linear interpolation for the B picture 280 is: (1/3) * (decoded backward reference picture), and the linear interpolation for the B picture 282 is: (2/3) * (decoded backward reference picture). Thus, this interpolation operation for the B pictures 280 and 282 is similar to a fade-in or fade-out process.
The linear interpolation for each B picture that references an invalid forward reference picture can be generalized as: ((m+1-n)/(m+1)) * (decoded forward reference picture) + (n/(m+1)) * (decoded backward reference picture), and the linear interpolation for a B picture that references a missing forward reference picture can be generalized as: (n/(m+1)) * (decoded backward reference picture). Here the parameter "m" is the number of B pictures in the adjacent sequence of B pictures and the parameter "n" is the position of the B picture in the sequence. For example, for B pictures 280 and 282 in fig. 2b, m is 2, n is 1 for B picture 280, and n is 2 for B picture 282.
Numerous embodiments of the present invention may include decoding the compressed video by, for example, video decoder 112b, the flow of which is described in the flow chart of fig. 3. Thus, when random access or video editing code is detected, there is a transition from the first state or step 300 to the second state or step 302. When a sequence header is detected, a transition is made from the second state to a third state or step 306. When other data blocks, such as I-pictures, P-pictures, B-pictures or video editing codes, are detected in the second state, a transition is made to the error state or step 304.
When an I picture is detected, a transition is made from the third state to a fourth state or step 308. When other data blocks such as P pictures, B pictures, video editing codes or sequence headers are detected in the third state, a transition is made to the error state. When a B picture that references a forward reference picture is detected, a transition is made from the fourth state to the fifth state or step 310. When an I picture or a P picture is detected, a transition is made from the fourth state to the first state. When a B picture that does not reference a forward reference picture is detected, a transition is made from the fifth state to the fourth state. When an I picture or a P picture is detected, a transition is made from the fifth state to the first state.
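The state transitions described for fig. 3 can be sketched as a small transition function. The enum and event names are hypothetical, and several behaviors are assumptions that the text does not spell out: unlisted events leave the first state unchanged, any non-header event in the second state is an error, repeated B pictures of the same kind keep the fourth or fifth state, and the error state is terminal.

```python
from enum import Enum


class State(Enum):
    FIRST = 300   # normal decoding
    SECOND = 302  # waiting for a sequence header
    ERROR = 304
    THIRD = 306   # waiting for the first I picture
    FOURTH = 308  # after the first I picture
    FIFTH = 310   # B picture that references a forward reference picture


class Event(Enum):
    RANDOM_ACCESS = "random_access"
    EDIT_CODE = "video_editing_code"
    SEQ_HEADER = "sequence_header"
    I_PICTURE = "I"
    P_PICTURE = "P"
    B_WITH_FORWARD = "B_with_forward_reference"
    B_WITHOUT_FORWARD = "B_without_forward_reference"


def next_state(state: State, event: Event) -> State:
    if state is State.FIRST:
        if event in (Event.RANDOM_ACCESS, Event.EDIT_CODE):
            return State.SECOND
        return State.FIRST  # assumption: other events keep normal decoding
    if state is State.SECOND:
        # Assumption: anything other than a sequence header is an error here.
        return State.THIRD if event is Event.SEQ_HEADER else State.ERROR
    if state is State.THIRD:
        return State.FOURTH if event is Event.I_PICTURE else State.ERROR
    if state in (State.FOURTH, State.FIFTH):
        if event in (Event.I_PICTURE, Event.P_PICTURE):
            return State.FIRST
        if event is Event.B_WITH_FORWARD:
            return State.FIFTH   # assumption when already in the fifth state
        if event is Event.B_WITHOUT_FORWARD:
            return State.FOURTH  # assumption when already in the fourth state
        return state  # assumption: unlisted events leave the state unchanged
    return State.ERROR  # assumption: the error state is terminal
```

Feeding the events video editing code, sequence header, I picture, B picture with forward reference, B picture without forward reference, and P picture through `next_state` walks the states 302, 306, 308, 310, 308 and back to 300.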
In an embodiment of the invention, the video decoder 112b may discard B pictures in both the fourth and fifth states. This method thus discards the adjacent sequence of B pictures that immediately follows the first I picture after the sequence header. In other embodiments of the invention, the video decoder 112b may discard only the B pictures in the fifth state. This method thus discards only those B pictures that reference the forward reference picture in the adjacent sequence of B pictures immediately following the first I picture after the sequence header.
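The two discard policies can be contrasted with a minimal sketch. The function name, the policy strings, and the tuple representation of a B picture are all hypothetical and chosen only for illustration.

```python
def surviving_b_pictures(b_sequence, policy="all"):
    """Return the B pictures kept after applying a discard policy.

    b_sequence: list of (name, no_forward_reference_flag) tuples describing
    the adjacent B pictures that follow the first I picture (hypothetical model).
    policy "all": discard every B picture (fourth and fifth states).
    policy "forward_only": discard only the B pictures whose flag is
    deasserted, i.e. those that reference the forward picture (fifth state).
    """
    if policy == "all":
        return []
    if policy == "forward_only":
        return [(name, flag) for name, flag in b_sequence if flag]
    raise ValueError(f"unknown policy: {policy}")
```

With the sequence `[("B0", False), ("B1", True)]`, the first policy keeps nothing, while the second keeps only `("B1", True)`.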
Other embodiments of the present invention interpolate decoded pictures corresponding to the adjacent sequence of B pictures that follows the first I picture after the sequence header. A typical adjacent sequence of B pictures includes two B pictures, e.g., B pictures 280 and 282, which may reference a forward reference picture, e.g., P picture 268, and a backward reference picture, e.g., I picture 278. When the decoded forward reference picture is available but not valid, the interpolated picture corresponding to the B picture 280 is generated as: (2/3) * (decoded forward reference picture) + (1/3) * (decoded backward reference picture). Similarly, the interpolated picture corresponding to the B picture 282 is generated as: (1/3) * (decoded forward reference picture) + (2/3) * (decoded backward reference picture).
When the decoded forward reference picture is not available, e.g., when random access occurs, the linear interpolation for the B picture 280 is: (1/3) * (decoded backward reference picture), and the linear interpolation for the B picture 282 is: (2/3) * (decoded backward reference picture). Thus, this interpolation operation for the B pictures 280 and 282 is similar to a fade-in or fade-out process.
The linear interpolation for each B picture that references an invalid forward reference picture can be generalized as: ((m+1-n)/(m+1)) * (decoded forward reference picture) + (n/(m+1)) * (decoded backward reference picture), and the linear interpolation for a B picture that references a missing forward reference picture can be generalized as: (n/(m+1)) * (decoded backward reference picture). Here the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is the position of the B picture in that sequence. For example, for B pictures 280 and 282 in fig. 2b, m is 2, n is 1 for B picture 280, and n is 2 for B picture 282.
One embodiment of the invention provides a machine-readable storage having stored thereon a computer program. The program comprises at least one piece of code for processing a B picture with a missing or invalid forward reference, the at least one piece of code being executable by a machine for enabling the machine to perform the method steps described in the present application.
Accordingly, the present invention may be implemented in hardware, software, firmware, or various combinations thereof. The present invention can be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware, software and firmware may be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. The computer program in this document refers to: any expression, in any programming language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to other languages, codes or symbols; b) reproduced in a different format. However, other meanings of computer program that can be understood by those skilled in the art are also encompassed by the present invention.
While the invention has been described with reference to several particular embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from its scope. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.
Claims (10)
1. A method of video signal processing, the method comprising:
after receiving a sequence header in compressed video data, determining whether a first I picture in the compressed video data is immediately followed by a sequence of consecutive adjacent B pictures;
and decoding the adjacent sequence of B pictures according to the determination, wherein the decoding operation performs B picture discarding or corresponding interpolation processing according to a video editing code and/or a random access point, the video editing code indicating that the consecutive B pictures immediately following the I picture may have invalid or missing forward reference pictures.
2. The method of claim 1, further comprising: discarding each B picture in the adjacent sequence of B pictures determined to immediately follow the first I picture.
3. The method of claim 1, further comprising: discarding each B picture in the neighboring sequence of B pictures when a forward reference picture referenced by each B picture in the neighboring sequence of B pictures is invalid or missing.
4. The method of claim 1, further comprising: interpolating a decoded picture corresponding to each B picture in the adjacent sequence of B pictures when a forward reference picture referred to by each B picture in the adjacent sequence of B pictures is invalid or missing.
5. The method of claim 4, wherein when the forward reference picture is invalid but available, the interpolation operation is performed using the following algorithm:
((m+1-n)/(m+1)) * (decoded forward reference picture) + (n/(m+1)) * (decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
6. A video signal processing system, the system comprising:
one or more circuits configured to determine whether a first I picture in the compressed video data is immediately followed by a sequence of consecutive adjacent B pictures, and decode the sequence of consecutive B pictures according to the determination, wherein the decoding operation performs B picture dropping or corresponding interpolation processing according to a video editing code and/or a random access point, the video editing code indicating that the consecutive B pictures immediately following the I picture may have invalid or missing forward reference pictures.
7. The system according to claim 6, wherein said one or more circuits discard each B picture in a contiguous sequence of B pictures determined to immediately follow said first I picture.
8. The system according to claim 6, wherein said one or more circuits are operable to discard each B picture in said neighboring sequence of B pictures when a forward reference picture referenced by each B picture in said neighboring sequence of B pictures is invalid or missing.
9. The system according to claim 6, wherein said one or more circuits are operable to interpolate a decoded picture corresponding to each B picture in said adjacent sequence of B pictures when a forward reference picture referenced by each B picture in said adjacent sequence of B pictures is invalid or missing.
10. The system according to claim 9, wherein said one or more circuits comprise one or more processors configured to perform interpolation of decoded pictures, wherein when said forward reference picture is invalid but available, said interpolation is performed using an algorithm comprising:
((m+1-n)/(m+1)) * (decoded forward reference picture) + (n/(m+1)) * (decoded backward reference picture),
wherein the parameter "m" is the number of B pictures in the adjacent sequence of B pictures, and the parameter "n" is a 1-based index of the B pictures.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/871,385 (US8194741B2) | 2007-10-12 | 2007-10-12 | Method and system for processing B pictures with missing or invalid forward reference pictures |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1131298A1 (en) | 2010-01-15 |
| HK1131298B (en) | 2011-12-09 |