HK1179791B - Unified binarization for CABAC/CAVLC entropy coding
- Publication number: HK1179791B (application HK13106800.2A)
- Authority: HK (Hong Kong)
Abstract
Unified binarization for CABAC/CAVLC entropy coding is disclosed. Scalable entropy coding is implemented in accordance with any desired degree of complexity (e.g., entropy encoding and/or decoding). For example, appropriately implemented context-adaptive variable-length coding (CAVLC) and context-adaptive binary arithmetic coding (CABAC) allow for selective entropy coding in accordance with a number of different degrees of complexity. A given device may operate in accordance with a first level of complexity at a first time, a second level of complexity at a second time, and so on. Appropriate coordination and signaling between an encoder/transmitter device and a decoder/receiver device allows for operation in accordance with a desired degree of complexity. For example, a variable length binarization module and an arithmetic encoding module may be implemented within an encoder/transmitter device, and a corresponding arithmetic decoding module and a variable length bin decoding module may be implemented within a decoder/receiver device, allowing for entropy coding at various degrees of complexity.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority from U.S. provisional patent application 61/515,819 filed on 5/2011 and U.S. utility patent application 13/523,818 filed on 14/2012, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates generally to digital video processing; and more particularly to signaling in accordance with such digital video processing.
Background
Communication systems that operate to communicate digital media (e.g., images, video, data, etc.) have been continuously developed for many years. For communication systems that employ some form of video data, many digital images are output or displayed at a certain frame rate (e.g., frames per second) to achieve a video signal suitable for output and consumption. In many such communication systems that operate using video data, there may be a tradeoff between throughput (e.g., the number of frames of images that may be transmitted from a first location to a second location) and the video and/or image quality of the signal that is ultimately to be output or displayed. The prior art has not adequately or desirably provided a means by which to ensure a relatively low amount of overhead associated with communication, a relatively low complexity of communication devices at each end of a communication link, etc., in terms of being able to transmit video data from a first location to a second location under adequate or acceptable video and/or image quality conditions.
Disclosure of Invention
According to an aspect of the invention, there is provided an apparatus comprising: an input to receive a plurality of syntax elements; an entropy encoder operative to adaptively perform entropy encoding according to a plurality of complexity operation modes, the entropy encoder comprising: a variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits; and an arithmetic coding module that processes the plurality of CABAC bins to produce a plurality of CABAC bits; and at least one output to output the CABAC bits when the entropy encoder operates according to a first one of the plurality of complexity operation modes and to output the CAVLC bits when the entropy encoder operates according to a second one of the plurality of complexity operation modes.
Preferably, wherein the plurality of CABAC bins are the plurality of CAVLC bits.
Preferably, wherein the entropy encoder performs entropy encoding according to the first one of the plurality of complexity operational modes at or during a first time; the entropy encoder performs entropy encoding according to the second one of the plurality of complexity operational modes at or during a second time; and the second one of the plurality of complexity operational modes is less complex than the first one of the plurality of complexity operational modes.
Preferably, wherein the apparatus is a first communication device; and further comprising: a second communication device that communicates with the first communication device via at least one communication channel, the second communication device comprising: at least one additional input receiving the plurality of CABAC bits or the plurality of CAVLC bits; and an entropy decoder comprising: an arithmetic decoding module that processes the plurality of CABAC bits to produce a plurality of CABAC bins; and a variable length bin decoding module that processes the plurality of CAVLC bits or the plurality of CABAC bins to produce a plurality of estimates of the plurality of syntax elements; and wherein: the second communication device is at least one of a computer, a laptop computer, a High Definition (HD) television, a Standard Definition (SD) television, a handheld media unit, a set-top box (STB), and a Digital Video Disc (DVD) player.
Preferably, wherein the apparatus is a communication device operating in at least one of a satellite communication system, a wireless communication system, a wired communication system, a fiber optic communication system and a mobile communication system.
According to another aspect of the present invention, there is provided an apparatus comprising: an input to receive a plurality of syntax elements; an entropy encoder comprising: a variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits; and an arithmetic coding module that processes the plurality of CABAC bins to produce a plurality of CABAC bits; and at least one output selectively outputting the plurality of CABAC bits or the plurality of CAVLC bits.
Preferably, wherein the plurality of CABAC bins are the plurality of CAVLC bits.
Preferably, wherein the entropy encoder is operative to adaptively perform entropy encoding according to a plurality of complexity operation modes.
Preferably, wherein the entropy encoder performs entropy encoding according to a first complexity operational mode at or during a first time; and the entropy encoder performs entropy encoding according to a second complexity operational mode at or during a second time.
Preferably, wherein the at least one output outputs the CABAC bits when the entropy encoder operates according to a first complexity operational mode; and when the entropy encoder operates according to a second complexity operation mode that is less complex than the first complexity operation mode, the at least one output outputs the CAVLC bits.
Preferably, the apparatus further comprises: at least one additional input receiving the plurality of CABAC bits or the plurality of CAVLC bits; and an entropy decoder comprising: an arithmetic decoding module that processes the plurality of CABAC bits to produce a plurality of CABAC bins; and a variable length bin decoding module that processes the plurality of CAVLC bits or the plurality of CABAC bins to produce a plurality of estimates of the plurality of syntax elements.
Preferably, wherein the apparatus is a first communication device; and further comprising: a second communication device that communicates with the first communication device via at least one communication channel, the second communication device comprising: at least one additional input receiving the plurality of CABAC bits or the plurality of CAVLC bits; and an entropy decoder comprising: an arithmetic decoding module that processes the plurality of CABAC bits to produce a plurality of CABAC bins; and a variable length bin decoding module that processes the plurality of CAVLC bits or the plurality of CABAC bins to produce a plurality of estimates of the plurality of syntax elements; and wherein: the second communication device is at least one of a computer, a laptop computer, a High Definition (HD) television, a Standard Definition (SD) television, a handheld media unit, a set-top box (STB), and a Digital Video Disc (DVD) player.
Preferably, wherein the apparatus is a communication device operating in at least one of a satellite communication system, a wireless communication system, a wired communication system, a fiber optic communication system and a mobile communication system.
According to yet another aspect of the present invention, there is provided a method for operating an entropy encoder of a communication device, the method comprising: receiving a plurality of syntax elements; operating a variable length binarization module of the entropy encoder to process the plurality of syntax elements to produce a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits; and operating an arithmetic coding module of the entropy encoder to process the plurality of CABAC bins to produce a plurality of CABAC bits; and selectively outputting the plurality of CABAC bits or the plurality of CAVLC bits.
Preferably, wherein the plurality of CABAC bins are the plurality of CAVLC bits.
Preferably, wherein the entropy encoder is operative to adaptively perform entropy encoding according to a plurality of complexity operation modes.
Preferably, the method further comprises: operating the entropy encoder to perform entropy encoding according to a first complexity operational mode at or during a first time; and operating the entropy encoder to perform entropy encoding according to a second complexity operational mode at or during a second time.
Preferably, the method further comprises: outputting the CABAC bits when the entropy encoder operates according to a first complexity operational mode; and outputting the CAVLC bits when the entropy encoder operates according to a second complexity operational mode that is less complex than the first complexity operational mode.
Preferably, the method further comprises: operating an additional communication device to communicate with the communication device via at least one communication channel by: receiving the plurality of CABAC bits or the plurality of CAVLC bits; operating an arithmetic decoding module to process the plurality of CABAC bits to produce a plurality of CABAC bins; and operating a variable length bin decoding module to process the plurality of CAVLC bits or the plurality of CABAC bins to produce a plurality of estimates of the plurality of syntax elements, wherein the additional communication device is at least one of a computer, a laptop computer, a High Definition (HD) television, a Standard Definition (SD) television, a handheld media unit, a set-top box (STB), and a Digital Video Disc (DVD) player.
Preferably, wherein: the communication device operates in at least one of a satellite communication system, a wireless communication system, a wired communication system, a fiber optic communication system, and a mobile communication system.
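Before turning to the drawings, the selective encoder behavior recited above may be illustrated with a short sketch. The following Python fragment is a minimal sketch only, not the claimed implementation: the toy unary binarization and the identity stub standing in for the arithmetic coding module are assumptions chosen to keep the example self-contained, whereas a real CABAC engine maintains context models and arithmetic interval state.

```python
# Minimal sketch of the unified CABAC/CAVLC entropy encoder idea:
# one variable length binarization module feeds either an arithmetic
# coding module (CABAC mode) or the output directly (CAVLC mode).

def variable_length_binarize(syntax_element: int) -> str:
    """Toy unary binarization: value n -> n ones followed by a zero.
    In the unified architecture, the same output serves either as
    CABAC bins or directly as CAVLC bits."""
    return "1" * syntax_element + "0"

def arithmetic_encode(bins: str) -> str:
    """Stub for the arithmetic coding module mapping CABAC bins to
    CABAC bits; a real engine is context-adaptive and stateful."""
    return bins  # placeholder: identity instead of true arithmetic coding

def entropy_encode(syntax_elements, mode: str) -> str:
    """First (higher) complexity mode 'cabac': binarize, then
    arithmetically encode the bins. Second (lower) complexity mode
    'cavlc': output the binarization directly as the CAVLC bits."""
    bins = "".join(variable_length_binarize(se) for se in syntax_elements)
    return arithmetic_encode(bins) if mode == "cabac" else bins

print(entropy_encode([3, 0, 2], mode="cavlc"))  # -> 11100110
```

Note how the sketch mirrors the preferred feature that the plurality of CABAC bins are the plurality of CAVLC bits: both modes share a single binarization stage.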
Drawings
Fig. 1 and 2 illustrate various embodiments of a communication system.
FIG. 3A illustrates an embodiment of a computer.
FIG. 3B illustrates an embodiment of a laptop computer.
Fig. 3C shows an embodiment of a High Definition (HD) television set.
Fig. 3D shows an embodiment of a Standard Definition (SD) television set.
Fig. 3E illustrates an embodiment of a handheld media unit.
Fig. 3F shows an embodiment of a set-top box (STB).
Figure 3G shows an embodiment of a Digital Video Disc (DVD) player.
Fig. 3H illustrates an embodiment of a general digital image and/or video processing apparatus.
Fig. 4, 5 and 6 are diagrams illustrating various embodiments of video encoding architectures.
Fig. 7 is a diagram illustrating an embodiment of an intra prediction process.
Fig. 8 is a diagram illustrating an embodiment of an inter prediction process.
Fig. 9 and 10 are diagrams illustrating various embodiments of video decoding architectures.
Fig. 11 illustrates an embodiment of a table showing binarization of macroblock quantizer step change (macroblock quantization delta) syntax elements for Context Adaptive Binary Arithmetic Coding (CABAC) and Context Adaptive Variable Length Coding (CAVLC) entropy coding.
Fig. 12 shows an embodiment of separate and corresponding architectures for CABAC and CAVLC entropy coding, respectively.
Fig. 13 shows an embodiment of a separate and corresponding architecture for CABAC/CAVLC entropy decoding, respectively.
Fig. 14 shows an embodiment of separate and corresponding architectures for CABAC and CAVLC entropy coding, respectively, and in particular, where Variable Length Coding (VLC) coding is utilized.
Fig. 15 shows an embodiment of separate and corresponding architectures for CABAC and CAVLC entropy decoding, respectively, and in particular where VLC decoding is utilized.
Fig. 16A shows an embodiment of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where Variable Length Coding (VLC) coding is utilized.
Fig. 16B illustrates an embodiment of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where Variable Length Coding (VLC) decoding is utilized.
Fig. 17A shows an embodiment of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where variable length binarization is utilized.
Fig. 17B shows an embodiment of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where variable length bin (binary symbol) decoding is utilized.
Fig. 18A shows an embodiment of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, variable length binarization with increased complexity, and also in particular arithmetic coding with reduced complexity for CABAC entropy coding.
Fig. 18B illustrates an embodiment of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, variable length bin decoding with increased complexity according to CABAC entropy decoding, and also in particular arithmetic decoding with reduced complexity for CABAC entropy decoding.
Fig. 19A shows an embodiment of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where variable length binarization with increased complexity is utilized.
Fig. 19B illustrates an embodiment of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where variable length bin decoding with increased complexity is utilized in accordance with CABAC entropy decoding.
Fig. 20A shows an embodiment of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where variable length binarization with reduced complexity is utilized.
Fig. 20B illustrates an embodiment of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where reduced complexity variable length bin decoding is utilized in accordance with CABAC entropy decoding.
Fig. 21, 22A, 22B, 23, 24A and 24B illustrate various embodiments of methods performed in accordance with video encoding (e.g., within one or more communication devices).
Detailed Description
In many devices that use digital media, such as digital video, pixels are used to present corresponding images (digital in nature) of the digital media. In some communication systems, digital media may be transferred from a first location to a second location where the media may be output or displayed. Digital communication systems (including those operating to communicate digital video) aim to transmit digital data from one location or subsystem to another without error or with an acceptably low error rate. As shown in fig. 1, data may be transmitted over a wide variety of communication channels in a wide variety of communication systems: magnetic media, wired, wireless, fiber optic, copper, and/or other types of media.
Fig. 1 and 2 are diagrams illustrating various embodiments of communication systems 100 and 200, respectively.
Referring to fig. 1, an embodiment of a communication system 100 includes a communication channel 199 that communicatively couples a communication device 110 (including a transmitter 112 with an encoder 114 and including a receiver 116 with a decoder 118) at one end of the communication channel 199 to another communication device 120 (including a transmitter 126 with an encoder 128 and including a receiver 122 with a decoder 124) at the other end of the communication channel 199. In some embodiments, either of communication devices 110 and 120 may include only a transmitter or only a receiver. The communication channel 199 may be implemented over several different types of media (e.g., satellite communication channel 130 using satellite dishes 132 and 134, wireless communication channel 140 using communication towers 142 and 144 and/or local antennas 152 and 154, wired communication channel 150, and/or fiber optic communication channel 160 using an electro-optical (E/O) interface 162 and an optical-electrical (O/E) interface 164). In addition, multiple types of media may be implemented and interfaced together to form communication channel 199.
It is noted that such communication devices 110 and/or 120 may be stationary or mobile without departing from the scope and spirit of the present invention. For example, either or both of communication devices 110 and 120 may be implemented in a fixed location or may be mobile communication devices, with the following capabilities: associated with and/or communicating with more than one network access point (e.g., different respective Access Points (APs)) in the context of a mobile communication system that includes one or more Wireless Local Area Networks (WLANs), different respective satellites in the context of a mobile communication system that includes one or more satellites, or different respective network access points in the context of a mobile communication system that generally includes one or more network access points through which communication with communication devices 110 and/or 120 may be effectuated.
In order to reduce transmission errors that may be undesirably generated within a communication system, error correction and channel coding schemes are typically utilized. In general, these error correction and channel coding schemes involve the use of an encoder at the transmitter end of communication channel 199 and a decoder at the receiver end of communication channel 199.
Any of the various types of ECC codes may be used in any such desired communication system (e.g., including those variations described with respect to fig. 1), any information storage device (e.g., Hard Disk Drive (HDD), network information storage device and/or server, etc.), or any application where encoding and/or decoding of information is desired.
In general, when considering a communication system in which video data is communicated from one location or subsystem to another location or subsystem, video data encoding may generally be considered to be performed at a transmitting end of communication channel 199, while video data decoding may generally be considered to be performed at a receiving end of communication channel 199.
Furthermore, although the embodiments of the present figure show bi-directional communication between communication devices 110 and 120 being possible, it should of course be noted that in some embodiments, communication device 110 may only include video data encoding capabilities and communication device 120 may only include video data decoding capabilities, and vice versa (e.g., in a one-way communication embodiment such as according to a video broadcast embodiment).
Referring to the communication system 200 of fig. 2, at the transmitting end of the communication channel 299, information bits 201 (e.g., corresponding, in one embodiment, to video data) are provided to a transmitter 297, the transmitter 297 being operable to perform encoding of the information bits 201 using an encoder and symbol mapper 220 (which may be viewed as distinct functional blocks 222 and 224, respectively) to produce a sequence of discrete-valued modulation symbols 203 that is provided to a transmit driver 230, the transmit driver 230 using a DAC (digital-to-analog converter) 232 to produce a continuous-time transmit signal 204 and using a transmit filter 234 to produce a filtered, continuous-time transmit signal 205 that substantially comports with the communication channel 299. At the receiving end of the communication channel 299, a continuous-time received signal 206 is provided to an AFE (analog front end) 260, which includes a receive filter 262 (which produces a filtered, continuous-time received signal 207) and an ADC (analog-to-digital converter) 264 (which produces a discrete-time received signal 208). A metric generator 270 computes metrics 209 (e.g., on a symbol and/or bit basis) that are employed by a decoder 280 to make best estimates 210 of the discrete-valued modulation symbols and the information bits encoded therein.
Any desired integration of the various components, blocks, functional blocks, circuits, etc. may be performed within each of the transmitter 297 and receiver 298. For example, the figure shows processing module 280a as including encoder and symbol mapper 220 and all associated, corresponding components therein, and processing module 280b as including metric generator 270 and decoder 280 and all associated, corresponding components therein. Such processing modules 280a and 280b may be respective integrated circuits. Of course, other demarcations and groupings may alternatively be made without departing from the scope and spirit of the present invention. For example, all components within the transmitter 297 may be contained within a first processing module or integrated circuit, and all components within the receiver 298 may be contained within a second processing module or integrated circuit. Alternatively, any other combination of components within each transmitter 297 and receiver 298 may be made in other embodiments.
As with the previous embodiments, such a communication system 200 may be used for communication of video data traveling from one location or subsystem to another location or subsystem (e.g., from a transmitter 297 to a receiver 298 via a communication channel 299).
Digital images and/or media (including corresponding images within digital video signals) and/or video processing of digital images may be performed by any of the various means described below with respect to fig. 3A-3H to allow a user to view such digital images and/or video. These various means do not include an exhaustive list of means by which the image and/or video processing described herein may be implemented, and it is noted that any general purpose digital image and/or video processing means may be implemented to perform the processing described herein without departing from the scope and spirit of the present invention.
Fig. 3A illustrates an embodiment of a computer 301. The computer 301 may be a desktop computer or an enterprise storage device (such as a server) attached to a host computer of a storage array (such as a Redundant Array of Independent Disks (RAID) array), a storage router, an edge router, a storage switch, and/or a storage director. A user can view still digital images and/or video (e.g., a sequence of digital images) using the computer 301. Typically, a variety of image and/or video viewing programs and/or media player programs are included on the computer 301 to allow a user to view such images (including video).
Fig. 3B illustrates an embodiment of laptop computer 302. Such a laptop computer 302 may be found and used in any of a variety of situations. In recent years, as the processing power and functionality built into laptop computers has increased, they are being used in many situations where high-end and more capable desktop computers were previously to be used. Like computer 301, laptop computer 302 may include a variety of image viewing programs and/or media player programs to allow a user to view such images (including video).
Fig. 3C shows an embodiment of a High Definition (HD) television 303. Many HD television sets 303 include integrated tuners to allow media content (e.g., television broadcast signals) to be received, processed, and decoded thereon. Alternatively, the HD television 303 sometimes receives media content from another source, such as a Digital Video Disc (DVD) player or a Set Top Box (STB) that receives, processes, and decodes cable and/or satellite television broadcast signals. Regardless of the particular implementation, the HD television 303 may be implemented to perform image and/or video processing as described herein. In general, the HD television 303 has the capability to display HD media content and is sometimes implemented with a 16:9 widescreen aspect ratio.
Fig. 3D shows an embodiment of a Standard Definition (SD) television 304. Of course, the SD television 304 is somewhat similar to the HD television 303, with at least one difference being that the SD television 304 does not have the capability to display HD media content, and the SD television 304 is typically implemented with a 4:3 full-screen aspect ratio. Nonetheless, even the SD television 304 may be implemented to perform image and/or video processing as described herein.
Fig. 3E illustrates an embodiment of a handheld media unit 305. The handheld media unit 305 is operable to provide general storage or storage of image/video content information, such as Joint Photographic Experts Group (JPEG) files, Tagged Image File Format (TIFF) files, bitmaps, Moving Picture Experts Group (MPEG) files, Windows Media (WMA/WMV) files, other types of video content such as MPEG4 files to be played back to a user, and/or any other type of information that may be stored in a digital format. Historically, such handheld media units were used primarily for audio media storage and playback; however, such a handheld media unit 305 may be used for the storage and playback of virtually any media (e.g., audio media, video media, photographic media, etc.). In addition, such a handheld media unit 305 may also include other functionality, such as integrated communication circuitry for wired and wireless communication. Such a handheld media unit 305 may be implemented to perform image and/or video processing as described herein.
Fig. 3F illustrates an embodiment of a Set Top Box (STB) 306. As mentioned above, sometimes the STB306 may be implemented to receive, process and decode cable and/or satellite television broadcast signals to be provided to any suitable display-enabled device, such as the SD television 304 and/or the HD television 303. Such STB306 may operate independently or in cooperation with such display-enabled device to perform image and/or video processing as described herein.
Figure 3G shows an embodiment of a Digital Video Disc (DVD) player 307. Such a DVD player may be a blu-ray DVD player, an HD enabled DVD player, an SD enabled DVD player, an upsample enabled DVD player (e.g., from SD to HD, etc.), and the like, without departing from the scope and spirit of the present invention. The DVD player may provide signals to any suitable display-enabled device, such as the SD television 304 and/or the HD television 303. The DVD player 307 may be implemented to perform image and/or video processing as described herein.
Fig. 3H illustrates an embodiment of a general digital image and/or video processing device 308. Moreover, as mentioned above, these various devices described above do not include an exhaustive list of devices that can implement the image and/or video processing described herein, and it is noted that any general purpose digital image and/or video processing device 308 can be implemented to perform the image and/or video processing described herein without departing from the scope and spirit of the present invention.
Fig. 4, 5 and 6 are diagrams illustrating various embodiments 400, 500 and 600 of video coding architectures, respectively.
Referring to the embodiment 400 of fig. 4, it can be seen in the light of this figure that an input video signal is received by a video encoder. In some embodiments, the input video signal consists of Coding Units (CUs) or Macroblocks (MBs). Such coding units or macroblocks may vary in size and may include a number of pixels, typically arranged in a square. In one embodiment, such a coding unit or macroblock has a size of 16 × 16 pixels. However, it is generally noted that a macroblock may have any desired size, such as N × N pixels, where N is an integer. Of course, while square coding units or macroblocks are utilized in the preferred embodiment, certain implementations may include non-square coding units or macroblocks.
The input video signal may be generally referred to as corresponding to original frame (or picture) image data. For example, raw frame (or picture) image data undergoes processing to produce luminance and chrominance samples. In some embodiments, the set of luma samples in a macroblock is in one particular arrangement (e.g., 16 x 16) and the set of chroma samples is in a different particular arrangement (e.g., 8 x 8). According to embodiments described herein, a video encoder processes such samples on a block-by-block basis.
Then, the input video signal undergoes mode selection, by which it is selectively subjected to intra prediction and/or inter prediction processing. Generally, the input video signal undergoes compression along a compression path. When operating without feedback (e.g., according to neither inter-frame prediction nor intra-frame prediction), the input video signal is provided via the compression path to undergo a transform operation (e.g., according to a Discrete Cosine Transform (DCT)). Of course, other transforms may be utilized in alternative embodiments. In this mode of operation, it is the input video signal itself that is compressed. The compression path may take advantage of the lack of sensitivity of the human eye to high frequencies in performing the compression process.
However, feedback may be utilized along the compression path by selectively using inter-prediction or intra-prediction video coding. The compression path operates on a (relatively low energy) residual (e.g., difference) resulting from subtracting the current macroblock predictor from the current macroblock, according to an operational feedback or prediction mode. Depending on which prediction form is utilized in a given situation, a residual or difference between the current macroblock and a macroblock prediction value based on at least a part of the same frame (or picture) or based on at least a part of at least one other frame (or picture) is generated.
The resulting modified video signal then undergoes a transform operation along the compression path. In one embodiment, a Discrete Cosine Transform (DCT) operates on a set of video samples (e.g., luminance, chrominance, residual, etc.) to compute respective coefficient values for each of a predetermined number of basis patterns. For example, one embodiment includes 64 basis functions (e.g., such as for 8 × 8 samples). In general, different implementations may utilize different numbers of basis functions (e.g., different transforms). Any combination of these basis functions, including their appropriate and selective weighting, may be used to represent a given set of video samples. Additional details regarding the various ways in which transform operations are performed are described in the technical literature relating to video encoding, including the standards/draft standards incorporated by reference as indicated above. The output from the transform process includes the respective coefficient values, and this output is provided to a quantizer.
In general, most image blocks will typically produce coefficients (e.g., DCT coefficients in an embodiment operating according to a Discrete Cosine Transform (DCT)), such that most of the associated DCT coefficients are lower frequencies. For the reasons mentioned above and the relatively poor sensitivity of the human eye to high frequency visual effects, the quantizer is operable to convert most of the less relevant coefficients to zero values. That is, those coefficients whose relative contribution is below some predetermined value (e.g., some threshold) may be removed according to the quantization process. The quantizer is also operable to convert the significant coefficients into values that are more efficiently encoded than the values produced by the transform process. For example, the quantization process may operate by dividing each respective coefficient by an integer value and discarding any remainder. When operating on a typical coding unit or macroblock, such processing typically results in a relatively low number of non-zero coefficients that are then delivered to an entropy encoder for lossless encoding and used according to a feedback path that may select intra-prediction and/or inter-prediction processing according to video encoding.
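To make the transform-and-quantize description above concrete, the following sketch (an illustration under assumed parameters, not a normative implementation) applies an orthonormal 2-D DCT to an 8 × 8 block containing a smooth gradient and then quantizes the coefficients by dividing by a single integer step size and discarding the remainder; real encoders derive the step size from a quantization parameter rather than fixing it.

```python
import numpy as np

def dct2(block: np.ndarray) -> np.ndarray:
    """Orthonormal 2-D DCT-II of a square block."""
    n = block.shape[0]
    k = np.arange(n)
    # Basis matrix: row u holds the u-th cosine basis function.
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c @ block @ c.T

# An 8x8 block with a smooth gradient, i.e., mostly low-frequency content.
block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 4

coeffs = dct2(block)                           # one weight per basis function
step = 16                                      # assumed uniform quantizer step
quantized = np.fix(coeffs / step).astype(int)  # divide, discard remainder

print(np.count_nonzero(quantized), "of", quantized.size, "coefficients survive")
```

Because the test block is smooth, only a handful of low-frequency coefficients survive quantization, which is precisely the property the entropy encoder exploits.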
The entropy encoder operates according to a lossless compression encoding process. In contrast, the quantization operations are generally lossy. The entropy encoding process operates on the coefficients supplied by the quantization process. These coefficients may represent a variety of characteristics (e.g., luminance, chrominance, residual, etc.). Various types of encoding may be utilized by the entropy encoder. For example, Context Adaptive Binary Arithmetic Coding (CABAC) and/or Context Adaptive Variable Length Coding (CAVLC) may be performed by the entropy encoder. For example, according to at least part of the entropy coding scheme, the data is converted into (run, level) pairs (e.g., the data 14, 3, 0, 4, 0, 0, -3 would be converted into the corresponding (run, level) pairs (0, 14), (0, 3), (1, 4), (2, -3)). A table may be prepared in advance that assigns variable length codes to the value pairs, such that relatively shorter length codes are assigned to relatively common value pairs and relatively longer length codes are assigned to relatively less common value pairs.
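The (run, level) conversion in the example above can be expressed directly; the sketch below illustrates only the generic pairing step, omitting the zigzag scan ordering, end-of-block signaling, and the actual variable length code tables of a real CAVLC coder.

```python
def run_level_pairs(coefficients):
    """Convert a scanned coefficient sequence into (run, level) pairs,
    where 'run' counts the zeros preceding each non-zero 'level'."""
    pairs, run = [], 0
    for coeff in coefficients:
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    return pairs

# The example from the text: 14, 3, 0, 4, 0, 0, -3
print(run_level_pairs([14, 3, 0, 4, 0, 0, -3]))
# -> [(0, 14), (0, 3), (1, 4), (2, -3)]
```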
The reader will understand that the operations of inverse quantization and inverse transformation correspond to the operations of quantization and transformation, respectively. For example, in an embodiment in which a DCT is utilized in the transform operation, an inverse DCT (IDCT) is utilized in the inverse transform operation.
A picture buffer (which may alternatively be referred to as a digital picture buffer or DPB) receives signals from the IDCT module; the picture buffer operatively stores the current frame (or picture) and/or one or more other frames (or pictures), such as may be used in accordance with intra-prediction and/or inter-prediction operations (which may be performed in accordance with video coding). Note that according to intra prediction, a relatively small amount of memory may be sufficient, since it may not be necessary to store the current frame (or picture) or any other frame (or picture) within the sequence of frames (or pictures). In the case of performing inter prediction according to video coding, such stored information may be used to perform motion compensation and/or motion estimation.
In one possible implementation, for motion estimation, a respective set of luma samples (e.g., 16 × 16) from the current frame (or picture) is compared with respective buffered counterparts in other frames (or pictures) within the sequence of frames (or pictures) (e.g., according to inter-prediction). In one possible implementation, the closest matching region (e.g., a prediction reference) is located and a vector offset (e.g., a motion vector) is generated. In a single frame (or picture), many motion vectors may be found, and they need not all point in the same direction. One or more operations performed in accordance with motion estimation are operable to generate one or more motion vectors.
Motion compensation operatively utilizes one or more motion vectors that may be generated from motion estimation. A set of prediction reference samples is identified and delivered for subtraction from the original input video signal in an effort to yield a relatively (e.g., ideally, much) lower energy residual. If this operation does not result in a lower energy residual, motion compensation need not be performed: the transform operation may operate on the original input video signal alone rather than on a residual (e.g., in accordance with a mode of operation in which the input video signal is provided directly to the transform operation, such that neither intra prediction nor inter prediction is performed), or intra prediction may be utilized and the transform operation applied to the residual resulting from intra prediction. In addition, if the motion estimation and/or motion compensation operations succeed, the motion vectors may also be sent to the entropy encoder, along with the corresponding residual coefficients, to undergo lossless entropy encoding.
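The following sketch illustrates the block-matching idea behind the motion estimation and motion compensation described above, assuming an exhaustive search over a whole reference frame purely for clarity; practical encoders restrict the search to a window around the block position and use fast search strategies.

```python
import numpy as np

def motion_estimate(current_block, reference_frame, block=16):
    """Exhaustive block matching: return the top-left position (dy, dx)
    of the reference region with minimum sum of absolute differences
    (SAD). The motion vector is this position taken relative to the
    current block's own location."""
    h, w = reference_frame.shape
    best_sad, best_pos = float("inf"), (0, 0)
    for dy in range(h - block + 1):
        for dx in range(w - block + 1):
            candidate = reference_frame[dy:dy + block, dx:dx + block]
            sad = int(np.abs(current_block - candidate).sum())
            if sad < best_sad:
                best_sad, best_pos = sad, (dy, dx)
    return best_pos, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(int)
cur = ref[10:26, 20:36].copy()        # a block lifted from position (10, 20)

pos, sad = motion_estimate(cur, ref)
print(pos, sad)                       # -> (10, 20) 0

# Motion compensation: subtract the identified predictor to form the
# (here zero-energy) residual that would go on to transform and coding.
residual = cur - ref[pos[0]:pos[0] + 16, pos[1]:pos[1] + 16]
```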
The output from the entire video encoding operation is an output bitstream. Note that the output bitstream may of course undergo some processing according to the generation of a continuous-time signal, which may be transmitted over a communication channel. For example, certain embodiments operate within a wireless communication system. In this case, the output bit stream may undergo suitable digital-to-analog conversion, frequency conversion, scaling, filtering, modulation, symbol mapping, and/or any other operation within the wireless communication device that operatively generates a continuous-time signal capable of transmission via a communication channel, and so forth.
Referring to the embodiment 500 of fig. 5, it can be seen in the light of this figure that an input video signal is received by a video encoder. In some implementations, the input video signal is composed of coding units or macroblocks (and/or is divisible into Coding Units (CUs)). The coding unit or macroblock size may vary and may include a number of pixels, typically arranged in a square. In one embodiment, the coding unit or macroblock has a size of 16 × 16 pixels. However, it is generally noted that a macroblock may have any desired size, such as N × N pixels, where N is an integer. Of course, while square coding units or macroblocks are utilized in the preferred embodiment, some implementations may include non-square coding units or macroblocks.
The input video signal may be generally referred to as corresponding to original frame (or picture) image data. For example, raw frame (or picture) image data may undergo processing to produce luma and chroma samples. In some embodiments, the set of luma samples in a macroblock is in one particular arrangement (e.g., 16 x 16) and the set of chroma samples is in a different particular arrangement (e.g., 8 x 8). According to an embodiment described herein, a video encoder processes the samples on a block-by-block basis.
Then, the input video signal undergoes mode selection, by which it is selectively subjected to intra prediction and/or inter prediction processing. Generally, the input video signal undergoes compression along a compression path. When operating without feedback (e.g., according to neither inter-frame prediction nor intra-frame prediction), the input video signal is provided via the compression path to undergo a transform operation (e.g., according to a Discrete Cosine Transform (DCT)). Of course, other transforms may be utilized in alternative embodiments. In this mode of operation, it is the input video signal itself that is compressed. The compression path may take advantage of the lack of sensitivity of the human eye to high frequencies in performing the compression process.
However, feedback may be utilized along the compression path by selectively using inter-prediction or intra-prediction video coding. The compression path operates on a (relatively low energy) residual (e.g., difference) resulting from subtracting the current macroblock predictor from the current macroblock, according to an operational feedback or prediction mode. Depending on which prediction form is utilized in a given situation, a residual or difference between the current macroblock and a macroblock prediction value based on at least a part of the same frame (or picture) or based on at least a part of at least one other frame (or picture) is generated.
The resulting modified video signal then undergoes a transform operation along the compression path. In one embodiment, a Discrete Cosine Transform (DCT) operates on a set of video samples (e.g., luminance, chrominance, residual, etc.) to compute respective coefficient values for each of a predetermined number of basis patterns. For example, one embodiment includes 64 basis functions (e.g., such as for 8 × 8 samples). In general, different implementations may utilize different numbers of basis functions (e.g., different transforms). Any combination of these basis functions, including their appropriate and selective weighting, may be used to represent a given set of video samples. Additional details regarding the various ways in which transform operations are performed are described in the technical literature relating to video encoding, including the standards/draft standards incorporated by reference as indicated above. The output from the transform process includes the respective coefficient values, and this output is provided to a quantizer.
In general, most image blocks will typically produce coefficients (e.g., DCT coefficients in an embodiment operating according to a Discrete Cosine Transform (DCT)), such that most of the associated DCT coefficients are lower frequencies. For the reasons mentioned above and the relatively poor sensitivity of the human eye to high frequency visual effects, the quantizer is operable to convert most of the less relevant coefficients to zero values. That is, those coefficients whose relative contribution is below some predetermined value (e.g., some threshold) may be removed according to the quantization process. The quantizer is also operable to convert the significant coefficients into values that are more efficiently encoded than the values produced by the transform process. For example, the quantization process may operate by dividing each respective coefficient by an integer value and discarding any remainder. When operating on a typical coding unit or macroblock, such processing typically results in a relatively low number of non-zero coefficients that are then delivered to an entropy encoder for lossless encoding and used according to a feedback path that may select intra-prediction and/or inter-prediction processing according to video encoding.
The entropy encoder operates according to a lossless compression encoding process. In contrast, the quantization operations are generally lossy. The entropy encoding process operates on the coefficients provided by the quantization process. These coefficients may represent a variety of characteristics (e.g., luminance, chrominance, residual, etc.). Various types of encoding may be utilized by the entropy encoder. For example, Context Adaptive Binary Arithmetic Coding (CABAC) and/or Context Adaptive Variable Length Coding (CAVLC) may be performed by the entropy encoder. For example, according to at least part of the entropy coding scheme, the data is converted into (run, level) pairs (e.g., the data 14, 3, 0, 4, 0, 0, -3 would be converted into the corresponding (run, level) pairs (0, 14), (0, 3), (1, 4), (2, -3)). A table may be prepared in advance that assigns variable length codes to the value pairs, such that relatively shorter length codes are assigned to relatively common value pairs and relatively longer length codes are assigned to relatively less common value pairs.
The reader will understand that the operations of inverse quantization and inverse transformation correspond to the operations of quantization and transformation, respectively. For example, in an embodiment in which a DCT is utilized in the transform operation, an inverse DCT (IDCT) is utilized in the inverse transform operation.
In some alternative embodiments, the output from the deblocking filter is provided to one or more other loop filters (e.g., implemented in accordance with a Sample Adaptive Offset (SAO) filter, an Adaptive Loop Filter (ALF), and/or any other filter type) implemented to process the output from the inverse transform block.
For example, the Adaptive Loop Filter (ALF) may be implemented to process the output from the deblocking filter, or alternatively, the ALF may be implemented to process the output from a Sample Adaptive Offset (SAO) filter, where the SAO filter first receives the output from the deblocking filter. The Adaptive Loop Filter (ALF) is applied to the decoded pictures before they are stored in a picture buffer (sometimes referred to as a DPB, digital picture buffer). The Adaptive Loop Filter (ALF) is implemented to reduce the coding noise of the decoded pictures, and its filtering may be selectively applied to luma and chroma, respectively, on a slice-by-slice basis, whether at the slice level or at the block level. Two-dimensional (2D) Finite Impulse Response (FIR) filtering may be used in the application of the Adaptive Loop Filter (ALF). The coefficients of the filter may be designed slice-by-slice at the encoder, and this information is then signaled to the decoder (e.g., signaled from a transmitter communication device that includes the video encoder (which may be referred to simply as an encoder) to a receiver communication device that includes the video decoder (which may be referred to simply as a decoder)).
One embodiment operates by generating coefficients according to a wiener filter design. Further, it is possible that whether to perform filtering is determined at the encoder on a block-by-block basis, and that decision is then signaled to the decoder based on a quadtree structure (e.g., signaled from a transmitter communication device comprising the video encoder (or which may be referred to as an encoder) to a receiver communication device comprising the video decoder (or which may be referred to as a decoder)), wherein the block size is decided according to rate distortion optimization. Note that implementations using this 2D filtering may introduce complexity in terms of both encoding and decoding. For example, by using 2D filtering implemented according to an Adaptive Loop Filter (ALF), there may be some increase in complexity within an encoder implemented within a transmitter communication device and within a decoder implemented within a receiver communication device.
In terms of one type of in-loop filter, the use of an Adaptive Loop Filter (ALF) in accordance with such video processing may provide a number of improvements, including an improvement in the objective quality measure of peak signal-to-noise ratio (PSNR) that comes from removing random quantization noise. Furthermore, the subjective quality of a subsequently encoded video signal may benefit from illumination compensation, which may be introduced in accordance with performing offset processing and scaling processing (e.g., applying a gain in accordance with Finite Impulse Response (FIR) filtering) as part of the Adaptive Loop Filter (ALF) processing.
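As a rough illustration of the two-dimensional FIR filtering that such an in-loop filter stage applies, the sketch below convolves a decoded picture with a fixed 3 × 3 smoothing kernel; the fixed kernel is an assumption for illustration, whereas in the scheme described above the coefficients would be designed at the encoder (e.g., via a Wiener design) and signaled to the decoder.

```python
import numpy as np

def fir_2d(picture: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply a 2-D FIR filter (edge-padded convolution) to a picture."""
    kh, kw = kernel.shape
    padded = np.pad(picture, ((kh // 2,), (kw // 2,)), mode="edge")
    out = np.zeros_like(picture, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + picture.shape[0],
                                         j:j + picture.shape[1]]
    return out

# Fixed smoothing kernel, standing in for encoder-designed ALF coefficients.
kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]], dtype=float) / 16.0

decoded = np.random.default_rng(1).normal(128, 10, size=(16, 16))
filtered = fir_2d(decoded, kernel)
print(filtered.shape)   # -> (16, 16), with quantization noise attenuated
```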
Receiving the output signal from the ALF is a picture buffer, which may alternatively be referred to as a digital picture buffer or DPB; the picture buffer operatively stores the current frame (or picture) and/or one or more other frames (or pictures), such as may be used in accordance with intra-prediction and/or inter-prediction operations (which may be performed in accordance with video coding). Note that according to intra prediction, a relatively small amount of memory may be sufficient, since it may not be necessary to store the current frame (or picture) or any other frame (or picture) within the sequence of frames (or pictures). In the case of performing inter prediction according to video coding, such stored information may be used to perform motion compensation and/or motion estimation.
In one possible implementation, for motion estimation, a respective set of luma samples (e.g., 16 x 16) from a current frame (or picture) is compared to respective buffer counterparts in other frames (or pictures) within a sequence of frames (or pictures) (e.g., according to inter-frame prediction). In one possible implementation, the closest matching region (e.g., prediction reference) is located and a vector offset (e.g., motion vector) is generated. In a single frame (or picture), many motion vectors can be found, and not all have to point in the same direction. One or more operations performed in accordance with motion estimation are operable to generate one or more motion vectors.
Motion compensation operatively utilizes one or more motion vectors that may be generated from motion estimation. A set of prediction reference samples is identified and delivered for subtraction from the original input video signal in an effort to yield a relatively (e.g., ideally, much) lower energy residual. If this operation does not result in a lower energy residual, motion compensation need not be performed: the transform operation may operate on the original input video signal alone rather than on a residual (e.g., in accordance with a mode of operation in which the input video signal is provided directly to the transform operation, such that neither intra prediction nor inter prediction is performed), or intra prediction may be utilized and the transform operation applied to the residual resulting from intra prediction. In addition, if the motion estimation and/or motion compensation operations succeed, the motion vectors may also be sent to the entropy encoder, along with the corresponding residual coefficients, to undergo lossless entropy encoding.
The output from the entire video encoding operation is an output bitstream. Note that the output bit stream may of course undergo some processing in accordance with the generation of a continuous-time signal, which may be transmitted over a communication channel. For example, certain embodiments operate within a wireless communication system. In this case, the output bit stream may undergo suitable digital-to-analog conversion, frequency conversion, scaling, filtering, modulation, symbol mapping, and/or any other operation within the wireless communication device that operatively generates a continuous-time signal capable of transmission via a communication channel, and so forth.
Referring to the embodiment 600 of fig. 6, in the context of this figure, an alternative embodiment of a video encoder is depicted that performs prediction, transform, and encoding processes to produce a compressed output bitstream. The video encoder may operate in accordance with, and be compliant with, one or more video coding protocols, standards, and/or recommended practices, such as ISO/IEC 14496-10 - MPEG-4 Part 10, AVC (Advanced Video Coding), alternatively referred to as H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding), ITU H.264/MPEG4-AVC.
Note that a corresponding video decoder, such as one located at the other end of the communication channel, is operative to perform complementary processing of decoding, inverse transformation, and reconstruction to generate (ideally) a corresponding decoded video sequence representing the input video signal.
As can be seen with this figure, alternative configurations and architectures may be used to implement video coding. In general, an encoder processes an input video signal (e.g., typically composed in units of coding units or macroblocks, often square and including N × N pixels). Video coding determines a current macroblock prediction based on previously encoded data. The previously encoded data may originate from the current frame (or picture) itself (e.g., such as from intra-prediction), or from one or more other frames (or pictures) that have been encoded (e.g., such as from inter-prediction). The video encoder subtracts the current macroblock prediction to form a residual.
In general, intra prediction operations utilize one or more block sizes of a particular size (e.g., 16 × 16, 8 × 8, or 4 × 4) to predict a current macroblock from surrounding, previously encoded pixels in the same frame (or picture). In general, inter prediction operates with a range of block sizes (e.g., 16 × 16 down to 4 × 4) to predict pixels in a current frame (or picture) from regions selected from within one or more previously encoded frames (or pictures).
In terms of transform and quantization operations, the residual sample block may undergo a transform using a particular transform (e.g., 4 × 4 or 8 × 8). One possible implementation of this transform operates according to the Discrete Cosine Transform (DCT). The transform operation outputs a set of coefficients such that each respective coefficient corresponds to a respective weighting value for one or more basis functions associated with the transform. After undergoing the transform, the transform coefficient blocks are quantized (e.g., each respective coefficient may be divided by an integer value and any associated remainder may be discarded, or they may be multiplied by an integer value). The quantization process is typically inherently lossy and it can reduce transform coefficient precision according to the Quantization Parameter (QP). Typically, many of the coefficients associated with a given macroblock are zero, and only some non-zero coefficients remain. Generally, a relatively high QP setting operatively results in a larger proportion of zero-valued coefficients and a smaller magnitude for non-zero coefficients, resulting in relatively high compression (e.g., a relatively low encoding bitrate) at the expense of relatively poor decoded image quality; the relatively low QP setting operatively allows more non-zero coefficients and larger magnitudes of non-zero coefficients to be retained after quantization, resulting in relatively lower compression (e.g., relatively higher encoding bit rates) with relatively better decoded image quality.
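The QP trade-off described above can be observed numerically in a small sketch. The step-size formula below, which roughly doubles every 6 QP steps from a base of 0.625, approximates H.264-style behavior but is used here only as an assumed illustration, not as the normative mapping.

```python
import numpy as np

# A row of transform coefficients, largest (low-frequency) terms first.
coeffs = np.array([220.0, 96.0, -41.0, 18.0, -7.0, 3.0, -1.0, 0.5])

for qp in (10, 22, 34):
    step = 0.625 * 2 ** (qp / 6)               # assumed QP-to-step mapping
    q = np.fix(coeffs / step).astype(int)      # divide, discard remainder
    print(f"QP={qp:2d} step={step:6.2f} quantized={q.tolist()}")

# Higher QP -> larger step -> more zero-valued coefficients and smaller
# magnitudes, i.e., higher compression at the cost of decoded quality.
```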
The video encoding process produces a number of values (which are encoded to form a compressed bitstream). Examples of such values include: quantized transform coefficients, information to be used by the decoder to recreate the appropriate prediction, information about the compression tools utilized during encoding and the structure of the compressed data, information about the complete video sequence, and so forth. The values and/or parameters (e.g., syntax elements) may undergo encoding within an entropy encoder operating according to CABAC, CAVLC, or some other entropy encoding scheme to produce an output bitstream that may be stored, transmitted (e.g., after undergoing appropriate processing, to produce a continuous-time signal suitable for a communication channel), and so forth.
In embodiments that operate using a feedback path, the transformed and quantized output undergoes inverse quantization and inverse transformation. One or both of intra-prediction and inter-prediction may be performed according to video encoding. In addition, motion compensation and/or motion estimation may be performed based on the video encoding.
The signal path output from the inverse quantization and inverse transform (e.g., IDCT) block that is provided to the intra prediction block is also provided to a deblocking filter. The output from the deblocking filter is provided to one or more other in-loop filters (e.g., implemented in accordance with a Sample Adaptive Offset (SAO) filter, an Adaptive Loop Filter (ALF), and/or any other filter type) implemented to process the output from the inverse transform block. For example, in one possible implementation, the ALF is applied to a decoded picture before the decoded picture is stored in a picture buffer (sometimes referred to as a DPB, digital picture buffer). The ALF is implemented to reduce the coding noise of the decoded picture, and its filtering may be selectively applied to luma and chroma, respectively, on a slice-by-slice basis, whether at the slice level or at the block level. Two-dimensional (2D) Finite Impulse Response (FIR) filtering may be used in the application of the ALF. The coefficients of the filter may be designed slice-by-slice at the encoder, and this information is then signaled to the decoder (e.g., signaled from a transmitter communication device that includes the video encoder (which may be referred to simply as an encoder) to a receiver communication device that includes the video decoder (which may be referred to simply as a decoder)).
One embodiment generates the coefficients according to a Wiener filter design. Further, whether to perform filtering may be determined at the encoder on a block-by-block basis, and that decision is then signaled to the decoder based on a quadtree structure (e.g., signaled from a transmitter communication device that includes the video encoder to a receiver communication device that includes the video decoder), where the block size is decided according to rate-distortion optimization. Note that implementations using this 2D filtering may introduce complexity in terms of both encoding and decoding. For example, by using 2D filtering implemented in accordance with ALF, there may be some increase in the complexity of the encoder implemented within the transmitter communication device as well as of the decoder implemented within the receiver communication device.
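As a rough, non-limiting illustration of the 2D FIR filtering described above (and not the standardized ALF itself), the following sketch applies a small kernel of encoder-signaled coefficients to a decoded picture plane. The 3 × 3 kernel, edge-replication border handling, and function names are assumptions made for brevity.

```python
# Illustrative sketch only: applying a 2D FIR filter with encoder-signaled
# coefficients to a decoded luma plane (not the standardized ALF).
import numpy as np

def apply_fir_2d(plane: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Filter a picture plane with a 2D FIR kernel (edge pixels replicated)."""
    kh, kw = kernel.shape
    padded = np.pad(plane.astype(np.float64),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(plane, dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + plane.shape[0],
                                           dx:dx + plane.shape[1]]
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Coefficients like these would be designed at the encoder (e.g., via a
# Wiener solution) and signaled to the decoder within the bitstream.
kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]], dtype=np.float64) / 16.0
decoded = np.random.default_rng(0).integers(0, 256, (16, 16), dtype=np.uint8)
filtered = apply_fir_2d(decoded, kernel)
```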
As mentioned with respect to other embodiments, the use of ALF can provide a number of improvements, including an improvement in the peak signal-to-noise ratio (PSNR) objective quality measure that results from removing random quantization noise. In addition, the subjective quality of a subsequently encoded video signal may be improved by illumination compensation, which, in accordance with the ALF processing, may be introduced by performing offset processing and scaling processing (e.g., applying a gain in accordance with the FIR filtering).
With respect to any video encoder architecture implemented to generate an output bitstream, it is noted that such architectures may be implemented within any of a variety of communication devices. The output bitstream may undergo additional processing, including error correction code (ECC) encoding, forward error correction (FEC) encoding, etc., to generate a modified output bitstream having additional redundancy therein. Further, with respect to such a digital signal, it is appreciated that the digital signal may undergo any appropriate processing in accordance with generating a continuous-time signal suitable or appropriate for transmission via a communication channel. That is, the video encoder architecture may be implemented within a communication device that is operable to perform one or more signal transmissions via one or more communication channels. Additional processing may be performed on the output bitstream generated by the video encoder architecture to produce a continuous-time signal that may be launched into a communication channel.
Fig. 7 is a diagram illustrating an embodiment 700 of an intra prediction process. As can be seen with respect to this figure, a current block of video data (e.g., often square and generally comprising N × N pixels) undergoes processing to estimate the corresponding pixels therein. According to this intra prediction, previously encoded pixels located above and to the left of the current block are utilized. From some perspectives, the intra prediction direction may be considered to correspond to a vector extending from the current pixel to a reference pixel located above or to the left of the current pixel. The details of intra-prediction applied to coding according to H.264/AVC are specified within the corresponding standard incorporated by reference above (e.g., International Telecommunication Union, ITU-T, Telecommunication Standardization Sector of ITU, H.264 (03/2010), Series H: Audiovisual and Multimedia Systems, Infrastructure of audiovisual services - Coding of moving video, Advanced video coding for generic audiovisual services, Recommendation ITU-T H.264, also alternately referred to as ISO/IEC 14496-10 - MPEG-4 Part 10, AVC (Advanced Video Coding), H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding), ITU H.264/MPEG-4-AVC, or equivalents).
The residual, i.e., the difference between the current pixel and the reference or prediction pixel, is what gets encoded. As can be seen with respect to this figure, intra prediction operates using pixels within a common frame (or picture). Of course, it is noted that a given pixel may have different respective components associated therewith, and there may be a different respective sample set for each respective component.
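The following sketch illustrates this residual formation for the simplest (DC) prediction mode only; the neighbor layout and names are assumptions, and real codecs additionally define many directional intra modes.

```python
# Illustrative sketch (simplified): DC intra prediction of an NxN block from
# previously reconstructed neighbors above and to the left, followed by
# residual formation.  Only the DC mode is shown.
import numpy as np

def intra_dc_predict(above: np.ndarray, left: np.ndarray, n: int) -> np.ndarray:
    """Predict every pixel of the current NxN block as the neighbor mean."""
    dc = int(np.rint((above.sum() + left.sum()) / (len(above) + len(left))))
    return np.full((n, n), dc, dtype=np.int64)

n = 4
above = np.array([100, 102, 101, 99])   # reconstructed row above the block
left = np.array([98, 100, 103, 101])    # reconstructed column to the left
current = np.random.default_rng(1).integers(90, 110, (n, n))

prediction = intra_dc_predict(above, left, n)
residual = current - prediction          # this residual is what gets coded
```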
Fig. 8 is a diagram illustrating an embodiment 800 of inter prediction processing. In contrast to intra-prediction, inter-prediction operates to identify motion vectors (e.g., inter-prediction directions) based on a current set of pixels within a current frame (or picture) and one or more reference or prediction sets of pixels located within one or more other frames (or pictures) within a sequence of frames (or pictures). It can be seen that the motion vector extends from the current frame (or picture) to another frame (or picture) within the sequence of frames (or pictures). Inter prediction may utilize sub-pixel interpolation such that the predicted pixel values correspond to a function of multiple pixels in a reference frame or picture.
Although a residual may be calculated in accordance with the inter prediction process, that residual is different from the residual calculated in accordance with the intra prediction process. In accordance with the inter prediction process, the residual at each pixel again corresponds to the difference between a current pixel and a predicted pixel value; however, in accordance with the inter prediction process, the current pixel and the reference or prediction pixel are not within the same frame (or picture). Although this figure shows inter prediction being utilized with respect to one or more previous frames or pictures, it is also noted that alternative embodiments may operate using references corresponding to frames both before and/or after the current frame. For example, several frames may be stored in accordance with appropriate buffering and/or memory management. When operating on a given frame, references may be generated from other frames that precede and/or follow that given frame.
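A minimal sketch of integer-pel motion compensation and inter residual formation follows. Sub-pel interpolation, multi-reference selection, and motion search are omitted, and the synthetic-shift setup and names are assumptions made for clarity.

```python
# Illustrative sketch (simplified): motion-compensated inter prediction with
# an integer-pel motion vector, followed by residual formation.
import numpy as np

def motion_compensate(ref: np.ndarray, mv: tuple[int, int],
                      y: int, x: int, n: int) -> np.ndarray:
    """Fetch the NxN prediction block that the motion vector points at."""
    dy, dx = mv
    return ref[y + dy:y + dy + n, x + dx:x + dx + n]

rng = np.random.default_rng(2)
reference = rng.integers(0, 256, (32, 32))
current_frame = np.roll(reference, shift=(1, 2), axis=(0, 1))  # synthetic motion

y, x, n = 8, 8, 4
current_block = current_frame[y:y + n, x:x + n]
mv = (-1, -2)           # points back to the matching area in the reference
prediction = motion_compensate(reference, mv, y, x, n)
residual = current_block - prediction   # all zeros here: a perfect match
```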
Coupled with a CU, a basic unit used for the prediction partition mode is the prediction unit, or PU. It is also noted that the PU is defined only for the last-depth CU, and its respective size is limited to that of the CU.
Fig. 9 and 10 are diagrams illustrating various embodiments 900 and 1000, respectively, of a video decoding architecture.
Generally, the video decoding architecture operates on an input bitstream. It is of course noted that such an input bitstream may be generated from a signal that is received by a communication device from a communication channel. Various operations may be performed on a continuous-time signal received from the communication channel, including digital sampling, demodulation, scaling, filtering, etc., such as may be appropriate in accordance with generating the input bitstream. Moreover, certain embodiments, in which one or more types of error correction code (ECC), forward error correction (FEC), etc. may be implemented, may perform appropriate decoding in accordance with such ECC, FEC, etc. to generate the input bitstream. That is, in certain embodiments in which additional redundancy has been added in accordance with generating a corresponding output bitstream (e.g., such as may be launched from a transmitter communication device or from a transmitter portion of a transceiver communication device), appropriate processing may be performed in accordance with generating the input bitstream. Overall, the video decoding architecture is implemented to process the input bitstream, thereby generating an output video signal corresponding (as closely as possible, and perfectly in an ideal case) to the original input video signal, for output to one or more video display capable devices.
Referring to the embodiment 900 of fig. 9, generally speaking, a decoder such as an entropy decoder (e.g., which may be implemented in accordance with CABAC, CAVLC, etc.) processes the input bitstream in accordance with performing the complementary process of the encoding performed within a video encoder architecture. The input bitstream may be viewed as being (as closely as possible, and perfectly in an ideal case) the compressed output bitstream generated by the video encoder architecture. Of course, in a real-life application, it is possible that some errors may be incurred in a signal transmitted via one or more communication links. The entropy decoder processes the input bitstream and extracts the appropriate coefficients, such as the DCT coefficients (e.g., such as information representing chroma, luma, etc.), and provides such coefficients to an inverse quantization and inverse transform block. In the event that a DCT transform is employed, the inverse quantization and inverse transform block may be implemented to perform an inverse DCT (IDCT) operation. Subsequently, a deblocking filter is implemented to generate the respective frames and/or pictures corresponding to the output video signal. These frames and/or pictures may be provided to a picture buffer, or decoded picture buffer (DPB), for use in performing other operations including motion compensation. In general, such motion compensation operations may be viewed as corresponding to the inter prediction associated with video encoding. Also, intra prediction may be performed on the signal output from the inverse quantization and inverse transform block. Analogously to video encoding, the video decoder architecture may be implemented to perform mode selection between intra prediction and inter prediction, or to perform neither, in accordance with decoding the input bitstream, thereby generating the output video signal.
Referring to embodiment 1000 of fig. 10, in certain alternative embodiments, where one or more loop filters (e.g., implemented in accordance with a Sample Adaptive Offset (SAO) filter, an Adaptive Loop Filter (ALF), and/or any other filter type) have been employed in accordance with the video encoding used to generate the output bitstream, a corresponding one or more loop filters may be implemented within the video decoder architecture. In one embodiment, an appropriate implementation of the one or more loop filters follows the deblocking filter.
Fig. 11 illustrates an embodiment 1100 of a table representing binarization of macroblock quantizer value change step syntax elements for Context Adaptive Binary Arithmetic Coding (CABAC) and Context Adaptive Variable Length Coding (CAVLC) entropy coding.
According to video coding such as that performed in accordance with AVC/H.264 (e.g., incorporated by reference above), an encoder can select between two different entropy coding methods: Context Adaptive Binary Arithmetic Coding (CABAC) and Context Adaptive Variable Length Coding (CAVLC). CABAC provides high compression efficiency but is relatively more computationally complex than CAVLC. CAVLC provides relatively lower complexity, but generally also provides less compression than CABAC. In accordance with certain embodiments of video processing architectures, separate and respective circuitry, hardware, software, processing modules, components, etc. may be implemented for performing the respective video coding operations in accordance with CABAC and CAVLC. That is, because of the inherent differences between CABAC and CAVLC, different respective circuitry, hardware, software, processing modules, components, etc. are provided for each of CABAC and CAVLC.
In general, CAVLC and CABAC both use separate and different variable length binary codes to represent syntax elements. For example, the binarizations/codewords for the macroblock quantizer value change step syntax element (mb_qp_delta) are shown in fig. 11. It can be seen that, for the very same macroblock quantizer value change step syntax element, CABAC and CAVLC each produce different corresponding values. In the case of CABAC, bins are generated, while in the case of CAVLC, bits are generated. That is, in the case of CABAC, the resulting CABAC bins subsequently undergo arithmetic encoding to produce bits, which are then inserted into the compressed bitstream. The codewords used to represent CAVLC syntax elements are simply referred to as bits; these bits are inserted directly into the compressed bitstream. The binarizations of CABAC syntax elements are referred to as bins; the bins are further compressed using an adaptive arithmetic coding method to produce the CABAC bits, which are inserted into the compressed bitstream.
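To make the bins-versus-bits distinction concrete, the following sketch contrasts a unary binarization (standing in for the CABAC bins) with a signed exp-Golomb codeword (standing in for the CAVLC bits) for the same mb_qp_delta values. The signed-to-unsigned mapping is the one employed by AVC's signed exp-Golomb coding; the exact per-element tables of fig. 11 are not reproduced and may differ.

```python
# Illustrative sketch of two different codes for the same syntax element.
def to_unsigned(v: int) -> int:
    """Map a signed syntax element value to a non-negative code number."""
    return 2 * v - 1 if v > 0 else -2 * v

def unary_bins(v: int) -> str:
    """CABAC-style unary binarization: k ones terminated by a zero."""
    return "1" * to_unsigned(v) + "0"

def exp_golomb_bits(v: int) -> str:
    """CAVLC-style signed exp-Golomb codeword for the same value."""
    k = to_unsigned(v) + 1                 # code number shifted into [1, 2, ...]
    body = bin(k)[2:]                      # binary representation of k
    return "0" * (len(body) - 1) + body    # leading-zero prefix plus body

for delta in (0, 1, -1, 2, -2):
    print(f"mb_qp_delta={delta:+d}  bins={unary_bins(delta):8s}  "
          f"bits={exp_golomb_bits(delta)}")
```

As the output shows, the two codes diverge immediately (e.g., +1 maps to bins "10" but bits "010"), which is precisely why separate parsing engines are conventionally required.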
Fig. 12 shows an embodiment 1200 of separate and corresponding architectures for CABAC and CAVLC entropy coding, respectively. It can be seen that according to CABAC entropy coding, the corresponding syntax elements undergo binarization, producing corresponding CABAC bins, and these CABAC bins are then subjected to arithmetic coding to produce CABAC bits, which are inserted into the compressed bitstream.
In the case of CAVLC entropy coding, the syntax elements undergo Variable Length Coding (VLC) encoding to produce CAVLC bits, which are inserted into the compressed bitstream.
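The arithmetic coding stage that distinguishes the CABAC path can be pictured with a minimal adaptive binary arithmetic encoder, such as the following sketch. It is a generic Witten-Neal-Cleary-style coder with a simple counting probability model, offered only as an illustration; the standardized CABAC engine instead uses table-driven probability states and a different renormalization scheme, and the class names and 12-bit probability scale here are assumptions.

```python
# Minimal adaptive binary arithmetic encoder (illustrative only, not the
# standardized CABAC engine).  Bins go in; compressed bits come out.
class Context:
    """Adaptive model tracking P(bin = 1) for one class of bins."""
    def __init__(self) -> None:
        self.ones, self.total = 1, 2                  # Laplace-style counts

    def p1(self) -> int:
        return max(1, min(4095, self.ones * 4096 // self.total))

    def update(self, bit: int) -> None:
        self.ones += bit
        self.total += 1

class BinaryArithmeticEncoder:
    TOP, HALF, QUARTER = 0xFFFFFFFF, 0x80000000, 0x40000000

    def __init__(self) -> None:
        self.low, self.high, self.pending, self.bits = 0, self.TOP, 0, []

    def _emit(self, bit: int) -> None:
        self.bits.append(bit)
        self.bits.extend([1 - bit] * self.pending)    # flush carry-pending bits
        self.pending = 0

    def encode(self, bit: int, ctx: Context) -> None:
        span = self.high - self.low + 1
        split = self.low + span * (4096 - ctx.p1()) // 4096 - 1
        if bit == 0:
            self.high = split                         # keep the "0" sub-interval
        else:
            self.low = split + 1                      # keep the "1" sub-interval
        ctx.update(bit)
        while True:                                   # renormalize the interval
            if self.high < self.HALF:
                self._emit(0)
            elif self.low >= self.HALF:
                self._emit(1)
                self.low -= self.HALF; self.high -= self.HALF
            elif self.low >= self.QUARTER and self.high < self.HALF + self.QUARTER:
                self.pending += 1                     # defer the straddling bit
                self.low -= self.QUARTER; self.high -= self.QUARTER
            else:
                break
            self.low <<= 1
            self.high = (self.high << 1) | 1

    def finish(self) -> list[int]:
        self.pending += 1
        self._emit(0 if self.low < self.QUARTER else 1)
        return self.bits

ctx, enc = Context(), BinaryArithmeticEncoder()
for b in [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]:              # a skewed bin stream
    enc.encode(b, ctx)
print(enc.finish())                                   # compressed output bits
```

For longer, skewed bin streams the adaptive model drives the output below one bit per bin, which is the extra compression the CABAC path buys over emitting bins directly.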
Fig. 13 shows an embodiment 1300 of separate and respective architectures for CABAC and CAVLC entropy decoding, respectively. The operations performed in this figure are the inverse of those performed in the previous fig. 12. For example, in the case of CABAC entropy decoding, bits are received and undergo arithmetic decoding, thereby producing CABAC bins, and these bins subsequently undergo bin decoding, thereby producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced in accordance with CABAC entropy decoding are the very same syntax elements that had been processed in accordance with CABAC entropy encoding.
With CAVLC entropy decoding, bits are received and subjected to Variable Length Coding (VLC) decoding, resulting in syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced from CAVLC entropy decoding are the same as those already produced from CAVLC entropy encoding.
The variable length coding method used in CAVLC and the binarization used in CABAC are very similar processes: both simply map syntax element values to binary strings of 1s and 0s. However, as may be seen from the example above, the mappings employed by CABAC and CAVLC are not the same. In fact, the CAVLC syntax is typically very different when compared to that of CABAC.
The reader should understand that there are differences between the CABAC and CAVLC coding methods. In general, the typical approach by which video coding devices are designed implements separate and respective circuitry, hardware, software, processing modules, components, real estate, die size, and/or processing resources, etc. for each of the CABAC and CAVLC coding operations. However, although there are certain differences between the CABAC and CAVLC coding methods, an appropriately designed device can be constructed in which at least some commonality between the two respective coding methods is exploited. That is, appropriately designed processing, whereby the respective CABAC and CAVLC coding methods share at least some common characteristics, can provide the capability to scale entropy coding complexity in certain applications. For example, by designing an architecture in which the CABAC and CAVLC coding methods share an appropriate amount of processing, a reduction in circuitry, hardware, software, processing modules, components, real estate, die size, and/or processing resources, etc. may be achieved by sharing common components and architectures for both the CABAC and CAVLC coding methods.
For example, many video processing devices may include functionality and capability to support both the CABAC and CAVLC coding methods; that is, such a video processing device may operatively perform both CABAC and CAVLC entropy encoding as well as both CABAC and CAVLC entropy decoding. For a hardware decoder implemented and operative to support both CABAC and CAVLC entropy decoding, the requirement to support both types of decoding (i.e., CABAC and CAVLC entropy decoding) typically demands dedicated hardware blocks for parsing the CABAC and CAVLC syntaxes. That is, separate and distinct hardware blocks are implemented for supporting the CAVLC and CABAC syntaxes, respectively. While such dual functionality may be included within certain video processing devices implemented to support and comply with AVC, the resource usage within such video processing devices is relatively inefficient. Moreover, such dual functionality can undesirably increase the cost, complexity, die size, real estate, processing resources, etc. of video processing devices implemented to support the mutually independent and different CABAC and CAVLC coding methods.
Herein, a unified CABAC/CAVLC coding architecture and/or methodology is presented that can provide scalable complexity within a video processing device. For example, within a video coding standard currently under development, such as HEVC/H.265 (e.g., incorporated by reference above), such a unified CABAC/CAVLC coding architecture may be utilized so that a single entropy coding architecture and/or methodology can provide scalable complexity. In particular, a unified CABAC/CAVLC coding architecture and/or methodology in accordance with various aspects of the invention, and their equivalents, enables a variety of options providing various designs and trade-offs between efficiency and complexity, without requiring the duplication entailed by implementing two completely separate entropy coding architectures and/or methodologies, including circuitry, hardware, software, processing modules, components, real estate, die size, and/or processing resources, etc.
As can be seen in the above fig. 11, different respective values are generated according to CABAC entropy coding, in particular in terms of bins, and CAVLC entropy coding, in particular in terms of bits. In some embodiments, a completely different binarization may be utilized according to CABAC entropy coding. That is, instead of utilizing the specific binarization currently utilized according to CABAC entropy coding, a differently implemented variable length binarization or a variant thereof may be utilized. It should be understood with respect to other illustrations and/or embodiments presented herein that commonality can be achieved with respect to CAVLC entropy coding by utilizing a suitably designed combination of variable length binarization and arithmetic coding according to CABAC entropy coding. It should be appreciated in view of the other illustrations and/or embodiments presented herein that the designer is provided with great freedom in implementing variations of the unified CABAC/CAVLC encoding architecture and/or methodology in accordance with aspects of the present invention and equivalents thereof.
Fig. 14 shows an embodiment 1400 of separate and corresponding architectures for CABAC and CAVLC entropy coding, respectively, and in particular, where Variable Length Coding (VLC) coding is utilized.
When comparing the embodiment 1400 with the embodiment 1200 of fig. 12, it can be seen that the binarization processing employed in accordance with CABAC entropy encoding has been replaced by the Variable Length Coding (VLC) encoding that is employed in accordance with CAVLC entropy encoding in the embodiment 1200 of fig. 12. That is, in the embodiment 1400, the very same Variable Length Coding (VLC) encoding is utilized for both CABAC entropy encoding and CAVLC entropy encoding. When CABAC entropy encoding is being performed, the bins generated in accordance with the Variable Length Coding (VLC) encoding subsequently undergo arithmetic encoding, thereby generating the CABAC bits that are inserted into the compressed bitstream.
The operations and processes shown at the bottom of this figure are similar and analogous to those performed in the embodiment 1200 of fig. 12 in terms of performing CAVLC encoding. That is, with CAVLC entropy encoding in implementation 1400, the syntax elements undergo VLC encoding, producing CAVLC bits that are inserted into the compressed bitstream.
From some perspectives, the embodiment 1400 shown with respect to fig. 14 may be understood as using the VLC coding from AVC, rather than the binarization commonly utilized in accordance with CABAC coding. In light of the other illustrations and/or embodiments presented herein, the reader should properly understand that different respective embodiments of variable length binarization may alternatively be utilized. That is, using the VLC coding from AVC, rather than the binarization typically utilized in accordance with CABAC coding, is one possible implementation, but many other implementations, variants, equivalents, etc. may alternatively be utilized, if desired.
Fig. 15 shows an embodiment 1500 of separate and respective architectures for CABAC and CAVLC entropy decoding, respectively, and in particular one in which VLC decoding is utilized. When comparing the embodiment 1500 with the embodiment 1300 of fig. 13, it can be seen that the bin decoding employed in accordance with CABAC entropy decoding has been replaced by the Variable Length Coding (VLC) decoding that is utilized in accordance with CAVLC entropy decoding in the embodiment 1300 of fig. 13. That is, in the embodiment 1500, the very same VLC decoding is utilized for both CABAC entropy decoding and CAVLC entropy decoding. When CABAC entropy decoding is being performed, the bins produced in accordance with the arithmetic decoding subsequently undergo VLC decoding, thereby producing the syntax elements. Again, as mentioned with respect to other illustrations and/or embodiments, ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced in accordance with CABAC entropy decoding are the very same syntax elements that had been processed in accordance with CABAC entropy encoding.
The operations and processes shown at the bottom of this figure, in terms of performing CAVLC decoding, are similar and analogous to those performed within the embodiment 1300 of fig. 13. That is, in accordance with CAVLC entropy decoding in the embodiment 1500, the bits undergo VLC decoding, thereby producing the syntax elements. Again, as mentioned with respect to other illustrations and/or embodiments, ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced in accordance with CAVLC entropy decoding are the very same syntax elements that had been processed in accordance with CAVLC entropy encoding.
As can be seen with respect to the embodiment 1400 of fig. 14 and the embodiment 1500 of fig. 15, different respective implementations of certain functional blocks and/or components are employed within the respective CABAC and CAVLC coding architectures and/or methodologies. Given the commonality and similarity between the various implementations of the CABAC and CAVLC coding architectures and/or methodologies, certain subsequent embodiments and/or illustrations operate by combining and sharing such functional blocks and/or components. For example, it is not necessary to utilize differently and separately implemented functional blocks and/or components for both the CABAC and CAVLC coding architectures and/or methodologies. Because of certain commonalities and similarities between the various implementations of the CABAC and CAVLC coding architectures and/or methodologies, savings may be achieved in circuitry, hardware, software, processing modules, components, real estate, die size, and/or processing resources, in that much is used in support of both CABAC and CAVLC coding.
Fig. 16A shows an embodiment 1600 of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where Variable Length Coding (VLC) coding is utilized. As can be seen with the embodiment 1600 of fig. 16A, VLC coding is utilized and shared for both CABAC and CAVLC entropy coding. The syntax elements undergo VLC coding. According to CAVLC coding, VLC coded syntax elements are provided as CAVLC bits, which are inserted into the compressed bitstream. However, according to CABAC coding, VLC coded syntax elements are then subjected to arithmetic coding, resulting in CABAC bits, which are inserted into the compressed bitstream. As can be seen with embodiment 1600, shared functional blocks and/or components may be utilized for supporting both CABAC and CAVLC entropy coding.
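By way of a non-limiting illustration, a minimal sketch of this shared-path encoder structure follows. The function and mode names are assumptions; `binarize` stands for the shared variable length binarization module, and `arithmetic_encode` for the arithmetic coding module.

```python
# Minimal sketch of the unified encoder of fig. 16A: one shared binarization
# stage feeds either the output directly (CAVLC) or an arithmetic stage (CABAC).
from enum import Enum
from typing import Callable, Iterable, List

class EntropyMode(Enum):
    CAVLC = "low-complexity"      # bypass the arithmetic coding stage
    CABAC = "high-efficiency"     # bins additionally arithmetic coded

def entropy_encode(syntax_elements: Iterable[int],
                   mode: EntropyMode,
                   binarize: Callable[[int], List[int]],
                   arithmetic_encode: Callable[[List[int]], List[int]]) -> List[int]:
    bins: List[int] = []
    for element in syntax_elements:
        bins.extend(binarize(element))   # shared variable length binarization
    if mode is EntropyMode.CAVLC:
        return bins                      # the bins already are the CAVLC bits
    return arithmetic_encode(bins)       # extra stage produces the CABAC bits
```

The design choice this captures is that the CAVLC path is simply the CABAC path with its arithmetic coding stage bypassed, which is what makes the entropy coding complexity scalable.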
Fig. 16B shows an embodiment 1601 of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where Variable Length Coding (VLC) decoding is utilized. As can be seen with the embodiment 1601 of fig. 16B, VLC decoding is utilized and shared for both CABAC and CAVLC entropy decoding. Depending on the particular implementation utilized, either CABAC bits or CAVLC bits are received by the entropy decoding architecture and/or method. According to CAVLC entropy decoding, the received CAVLC bits undergo VLC decoding, resulting in a syntax element. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced from CAVLC entropy decoding are the same as those already produced from CAVLC entropy encoding.
According to CABAC entropy decoding, the received CABAC bits undergo arithmetic decoding, producing bins, which are then subjected to VLC decoding, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced according to CABAC entropy decoding are the same as those syntax elements that have been produced according to CABAC entropy encoding.
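Mirroring the encoder-side sketch above, a unified decoder might be sketched as follows (again with assumed names). Signaling between the encoder/transmitter and decoder/receiver devices would indicate which mode applies; here that is abstracted as the `is_cabac` argument, and the single `parse_bins` stage is the lone syntax parsing engine regardless of which bit type was received.

```python
# Minimal sketch of the unified decoder of fig. 16B: CABAC bits are first
# arithmetic decoded into bins; CAVLC bits already are bins.  Either way, one
# shared bin decoding stage parses the syntax.
from typing import Callable, List

def entropy_decode(bits: List[int],
                   is_cabac: bool,
                   arithmetic_decode: Callable[[List[int]], List[int]],
                   parse_bins: Callable[[List[int]], List[int]]) -> List[int]:
    bins = arithmetic_decode(bits) if is_cabac else bits
    return parse_bins(bins)              # single shared syntax parsing engine
```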
As can be seen with embodiments 1600 and 1601, commonly utilized functional blocks and/or components are utilized for both CABAC and CAVLC entropy coding. In the present case, in embodiment 1600 VLC coding is utilized for both CABAC and CAVLC entropy coding; in embodiment 1601, VLC decoding is utilized for both CABAC and CAVLC entropy decoding. Properly designed and commonly shared VLC encoding and VLC decoding enables more efficient implementation of architectures and/or methods operable to support both CABAC and CAVLC entropy encoding. Although embodiments 1600 and 1601 utilize VLC encoding and VLC decoding as used in CABAC and CAVLC entropy encoding, it is noted that other implementations utilizing commonly shared functional blocks and/or components may be substituted in the unified implementation of binarization for CABAC/CAVLC entropy encoding. That is, embodiments 1600 and 1601, as shown in fig. 16A and 16B, respectively, utilize common VLC encoding and VLC decoding that support CABAC/CAVLC entropy encoding, respectively, and in other embodiments alternative forms of variable length binarization may be utilized.
Fig. 17A shows an implementation 1700 of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where variable length binarization is utilized. Implementation 1700 generally depicts variable length binarization utilized jointly for both CABAC and CAVLC entropy encoding. As can be seen in fig. 17A, the syntax elements undergo variable length binarization for both CABAC and CAVLC entropy coding. However, in the case of CABAC coding, syntax elements that have undergone variable length binarization (thereby generating bins) are then subjected to arithmetic coding, thereby generating CABAC bits, which are inserted into the compressed bitstream. Alternatively, in the case of CAVLC coding, the syntax elements that have undergone variable length binarization are in fact CAVLC bits, which are inserted into the compressed bitstream.
As can be appreciated with respect to the embodiment 1700 of fig. 17A, a single unified binarization is utilized for both CABAC and CAVLC entropy encoding. For example, given the similarity between CABAC binarization and CAVLC variable length coding, a generalized, single unified binarization may be utilized for both CABAC and CAVLC entropy encoding. Also, it is noted that one possible implementation of such a single unified binarization would be the VLC encoding utilized in accordance with AVC. However, alternative variants of such a single unified binarization, beyond the implementation utilized in accordance with AVC, may be utilized instead.
Fig. 17B shows an implementation 1701 of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where variable length bin decoding is utilized. Implementation 1701 generally depicts variable length bin decoding utilized in common for both CABAC and CAVLC entropy decoding. As can be seen with respect to fig. 17B, variable length bin decoding is utilized and shared for both CABAC and CAVLC entropy decoding. Depending on the particular implementation being utilized, either CABAC bits or CAVLC bits are received by the entropy decoding architecture and/or method. According to CAVLC entropy decoding, the received CAVLC bits undergo variable length bin decoding, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced from CAVLC entropy decoding are the same as those already produced from CAVLC entropy encoding.
According to CABAC entropy decoding, the received CABAC bits undergo arithmetic decoding, producing bins, which are then subjected to variable length bin decoding, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced according to CABAC entropy decoding are the same as those syntax elements that have been produced according to CABAC entropy encoding.
In accordance with future developments of video coding standards, including standards presently under development such as HEVC, an appropriately designed variable length binarization may be utilized such that the output (bins) of CABAC decompression (i.e., arithmetic decoding) uses exactly the same syntax as CAVLC. In this way, an encoder can choose whether to provide high efficiency or low complexity, yet VLC decoding will require only a single syntax parsing engine. It can therefore be seen that, by providing such an architecture and/or methodology operative to support both CABAC and CAVLC entropy coding, a single syntax parsing engine may be utilized.
Note that, when high-efficiency encoding is required, the entropy encoder may perform both binarization and arithmetic encoding; when low complexity is required, the arithmetic encoding is simply skipped or bypassed. The binary code used in both cases (CABAC and CAVLC entropy encoding) is the same, so no additional syntax parser is needed.
The distribution of complexity between the adaptive arithmetic coding and binarization steps may also be modified to better match the novel architectures and/or methodologies presented herein. For example, a designer is given a great deal of latitude in modifying the respective complexities of the variable length binarization and the arithmetic encoding. For example, if a fixed overall CABAC compression is required, then the degree to which the variable length binarization complexity is modified will govern a corresponding modification of the arithmetic encoding complexity. That is, if the variable length binarization complexity is increased to some degree, then the arithmetic encoding complexity may be reduced to a corresponding degree, such that the overall compression in accordance with CABAC entropy encoding remains unchanged. Alternatively, if the variable length binarization complexity is reduced to some degree, then the arithmetic encoding complexity may be increased to a corresponding degree, such that the overall compression in accordance with CABAC entropy encoding remains unchanged. Such implementations are described with respect to certain other embodiments and/or illustrations presented herein. It should also be understood that the overall compression of both CABAC and CAVLC entropy encoding may be increased or decreased as required by the particular implementation of any given application.
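One way to picture this complexity knob is given in the following hedged sketch (not any standardized mechanism): a context model whose context count is configurable. A single context yields cheap but weak adaptation, while more contexts cost more state and modeling work but adapt better, leaving room to pair either choice with a correspondingly stronger or weaker binarization. All names and the update rule are assumptions for illustration.

```python
# Illustrative sketch only: the number of contexts as a complexity parameter
# of the arithmetic coding stage (not CABAC's standardized state machine).
class ScalableContextModel:
    """Probability model whose context count is the complexity knob."""
    def __init__(self, num_contexts: int):
        self.num_contexts = num_contexts
        self.p1 = [2048] * num_contexts            # P(bin = 1), out of 4096

    def ctx(self, bin_index: int) -> int:
        return bin_index % self.num_contexts       # toy context selection

    def update(self, bin_index: int, bit: int) -> None:
        i = self.ctx(bin_index)
        target = 4096 if bit else 0
        self.p1[i] += (target - self.p1[i]) >> 5   # fixed exponential update
        self.p1[i] = max(1, min(4095, self.p1[i]))

low_complexity  = ScalableContextModel(num_contexts=1)   # weaker adaptation
high_efficiency = ScalableContextModel(num_contexts=16)  # stronger adaptation
```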
Considering an implementation employing the CABAC binarization as defined in accordance with AVC, it is noted that CABAC binarization as defined in AVC may itself be viewed as providing somewhat relatively weak compression when compared to the overall compression provided by CABAC entropy encoding and/or CAVLC entropy encoding. That is, the binarization operation within CABAC entropy encoding (e.g., such as with respect to the embodiment 1200 of fig. 12) may be viewed as providing relatively weak compression in comparison to other overall entropy coding schemes. Because the CABAC bins pass through an additional arithmetic encoding step (again, such as may be seen with respect to the embodiment 1200 of fig. 12), the particular binary codes used for AVC CABAC need not be especially efficient, since they can rely upon the additional compression provided by the arithmetic encoding step. CAVLC, on the other hand, provides fairly efficient variable length coding. While CAVLC entropy encoding may be relatively less efficient overall than CABAC encoding (e.g., in which arithmetic encoding is employed), CAVLC (and particularly its VLC coding component) may be acceptable in certain applications, given its reduced complexity.
In a unified binarization scheme, such as in accordance with various aspects of the invention presented herein and their equivalents, the arithmetic coding context adaptation may be slightly weakened, given that stronger binary codes are used up front. It will also be understood, with respect to other embodiments and/or figures presented herein, that depending upon the particular application, an appropriate decision regarding a suitable trade-off between increased compression efficiency and complexity may be made when employing a redefined and/or modified CABAC binarization. For example, in an embodiment intended to provide the same overall compression, a relatively stronger binarization may be combined with a relatively or slightly weaker adaptive arithmetic encoder, so as to provide the same overall performance as the existing CABAC approach without increasing the overall complexity.
Several subsequent figures and/or embodiments depict certain trade-offs that may be made in terms of increasing or decreasing variable length binarization complexity and/or increasing or decreasing arithmetic coding complexity according to CABAC/CAVLC entropy coding. Furthermore, several of the following figures and/or embodiments depict certain trade-offs that may be made in terms of increasing or decreasing variable length bin decoding complexity and/or increasing or decreasing arithmetic decoding complexity according to CABAC/CAVLC entropy decoding.
Fig. 18A shows an embodiment 1800 of a unified architecture for both CABAC and CAVLC entropy encoding, and in particular one in which variable length binarization of increased complexity is employed, in conjunction with arithmetic encoding of reduced complexity for CABAC entropy encoding. The embodiment 1800 generally depicts variable length binarization of relatively increased complexity that is utilized in common for both CABAC and CAVLC entropy encoding. As can be seen with respect to fig. 18A, the syntax elements undergo variable length binarization of relatively increased complexity for both CABAC and CAVLC entropy encoding. In the case of CABAC encoding, the syntax elements that have undergone the relatively increased complexity variable length binarization (thereby generating bins) subsequently undergo arithmetic encoding of relatively reduced complexity, thereby generating the CABAC bits that are inserted into the compressed bitstream. Alternatively, in the case of CAVLC encoding, the syntax elements that have undergone the relatively increased complexity variable length binarization are in fact the CAVLC bits, which are inserted into the compressed bitstream.
It can be seen that certain trade-offs can be made in terms of variable length binarization and arithmetic coding according to CABAC/CAVLC entropy coding when it is desired to maintain relatively fixed CABAC compression. In general, as the complexity of variable length binarization increases, then the arithmetic coding complexity should decrease accordingly. Alternatively, as the variable length binarization complexity decreases, then the arithmetic coding complexity should increase accordingly. Furthermore, with this embodiment, corresponding increases/decreases in complexity are made with respect to different functional blocks and/or components to ensure relatively fixed CABAC compression. Of course, as can be seen with the embodiment 1800 of fig. 18A, when CAVLC entropy encoding is performed, variable length binarization, which is of relatively increased complexity, will provide relatively greater compression than compression according to typical CAVLC entropy encoding.
Fig. 18B shows an embodiment 1801 of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular one in which variable length bin decoding of increased complexity is employed, in conjunction with arithmetic decoding of reduced complexity for CABAC entropy decoding. The operations of the embodiment 1801 of fig. 18B may properly be understood as the complementary or inverse operations of those shown in the embodiment 1800 of fig. 18A.
The embodiment 1801 generally depicts variable length bin decoding of relatively increased complexity that is utilized in common for both CABAC and CAVLC entropy decoding. As can be seen with respect to fig. 18B, variable length bin decoding of relatively increased complexity is utilized and shared for both CABAC and CAVLC entropy decoding. Depending upon the particular implementation being utilized, either CABAC bits or CAVLC bits are received by the entropy decoding architecture and/or methodology. In accordance with CAVLC entropy decoding, the received CAVLC bits undergo the relatively increased complexity variable length bin decoding, thereby producing the syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced in accordance with CAVLC entropy decoding are the very same syntax elements that had been processed in accordance with CAVLC entropy encoding.
According to CABAC entropy decoding, the received CABAC bits undergo arithmetic decoding of relatively reduced complexity, resulting in bins, which are then subjected to variable length bin decoding of relatively increased complexity, resulting in syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced according to CABAC entropy decoding are the same as those syntax elements that have been produced according to CABAC entropy encoding.
Fig. 19A shows an implementation 1900 of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, variable length binarization with increased complexity. Implementation 1900 generally depicts variable length binarization for relatively increased complexity that is utilized in common for both CABAC and CAVLC entropy coding. As can be seen in fig. 19A, the syntax elements undergo variable length binarization for both CABAC and CAVLC entropy coding with a relatively increased complexity. However, in the case of CABAC coding, syntax elements that have undergone variable length binarization (thereby generating bins) of relatively increased complexity are subsequently subjected to arithmetic coding, thereby generating CABAC bits, which are inserted into the compressed bitstream. Alternatively, in the case of CAVLC coding, the syntax elements that have undergone variable length binarization with a relatively increased complexity are in fact CAVLC bits (which are inserted into the compressed bitstream).
As can be seen in the present figure, while the complexity of variable length binarization is relatively increased, the arithmetic coding complexity utilized according to CABAC entropy coding is not necessarily relatively reduced. That is, the arithmetic coding according to CABAC entropy coding may not be changed. Thus, it should be appreciated that based on the increased complexity of variable length binarization utilized for both CABAC and CAVLC entropy coding, the overall compression for both will correspondingly increase.
Fig. 19B shows an embodiment 1901 of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where variable length bin decoding with increased complexity is utilized in accordance with CABAC entropy decoding. The operations in the embodiment 1901 of fig. 19B may be properly understood as complementary or inverse operations to those shown in the embodiment 1900 of fig. 19A.
The embodiment 1901 generally depicts variable length bin decoding of relatively increased complexity that is utilized in common for both CABAC and CAVLC entropy decoding. As can be seen with respect to fig. 19B, variable length bin decoding of relatively increased complexity is utilized and shared for both CABAC and CAVLC entropy decoding. Depending upon the particular implementation being utilized, either CABAC bits or CAVLC bits are received by the entropy decoding architecture and/or methodology. In accordance with CAVLC entropy decoding, the received CAVLC bits undergo the relatively increased complexity variable length bin decoding, thereby producing the syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced in accordance with CAVLC entropy decoding are the very same syntax elements that had been processed in accordance with CAVLC entropy encoding.
According to CABAC entropy decoding, the received CABAC bits undergo arithmetic decoding, producing bins, which are then subjected to variable length bin decoding of relatively increased complexity, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced according to CABAC entropy decoding are the same as those syntax elements that have been produced according to CABAC entropy encoding.
As can be seen with respect to the present figure, while the variable length bin decoding complexity is relatively increased, the arithmetic decoding complexity utilized in accordance with CABAC entropy decoding is not necessarily relatively reduced. That is, the arithmetic decoding in accordance with CABAC entropy decoding may be unchanged. To this end, it should be understood that, because the variable length bin decoding complexity utilized for both CABAC and CAVLC entropy decoding is relatively increased, the overall compression to which the bits received for both CABAC and CAVLC entropy decoding correspond is also correspondingly relatively increased.
Fig. 20A shows an implementation 2000 of a unified architecture for both CABAC and CAVLC entropy coding, and in particular, where variable length binarization with reduced complexity is utilized. Implementation 2000 generally depicts relatively reduced complexity variable length binarization for both CABAC and CAVLC entropy coding. As can be seen in fig. 20A, the syntax elements undergo variable length binarization for relatively reduced complexity for both CABAC and CAVLC entropy coding. However, in the case of CABAC coding, syntax elements that have undergone variable length binarization (thereby generating bins) with relatively reduced complexity are subsequently subjected to arithmetic coding, thereby generating CABAC bits, which are inserted into the compressed bitstream. Alternatively, in the case of CAVLC coding, the syntax elements that have undergone variable length binarization with relatively reduced complexity are in fact CAVLC bits (which are inserted into the compressed bitstream).
As can be seen in the present figure, while the complexity of variable length binarization is relatively reduced, the arithmetic coding complexity utilized according to CABAC entropy coding is not necessarily relatively reduced. That is, the arithmetic coding according to CABAC entropy coding may not be changed. Thus, it should be appreciated that based on the reduced complexity of variable length binarization utilized for both CABAC and CAVLC entropy coding, the overall compression for both will be correspondingly reduced.
Fig. 20B shows an embodiment 2001 of a unified architecture for both CABAC and CAVLC entropy decoding, and in particular, where reduced complexity variable length bin decoding is utilized in accordance with CABAC entropy decoding. The operation in the embodiment 2001 of fig. 20B may be properly understood as a complementary or inverse operation to that shown in the embodiment 2000 of fig. 20A.
Implementation 2001 generally depicts relatively reduced complexity variable length bin decoding that is utilized jointly for both CABAC and CAVLC entropy decoding. As can be seen with respect to fig. 20B, variable length bin decoding with relatively reduced complexity is utilized and shared for both CABAC and CAVLC entropy decoding. Depending on the particular implementation being utilized, either CABAC bits or CAVLC bits are received by the entropy decoding architecture and/or method. According to CAVLC entropy decoding, the received CAVLC bits undergo variable length bin decoding with relatively reduced complexity, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced from CAVLC entropy decoding are the same as those already produced from CAVLC entropy encoding.
According to CABAC entropy decoding, the received CABAC bits undergo arithmetic decoding, producing bins, which are then subjected to variable length bin decoding of relatively reduced complexity, producing syntax elements. Ideally (e.g., assuming no unrecoverable errors, deleterious effects, etc.), the syntax elements produced according to CABAC entropy decoding are the same as those syntax elements that have been produced according to CABAC entropy encoding.
As can be seen with respect to the present figure, while the variable length bin decoding complexity is relatively reduced, the arithmetic decoding complexity utilized in accordance with CABAC entropy decoding is not necessarily relatively reduced. That is, the arithmetic decoding in accordance with CABAC entropy decoding may be unchanged. To this end, it should be understood that, because the variable length bin decoding complexity utilized for both CABAC and CAVLC entropy decoding is relatively reduced, the overall compression to which the bits received for both CABAC and CAVLC entropy decoding correspond is also correspondingly relatively reduced.
As will be appreciated in light of the various embodiments and/or figures presented herein, the designer is provided with a great deal of flexibility in implementing various functional blocks and/or methods in a manner that certain functional blocks and/or methods may be shared and used in common according to CABAC/CAVLC entropy encoding and CABAC/CAVLC entropy decoding. Depending on the particular application, various trade-offs may be made between the complexity of the commonly shared functional blocks and/or methods versus the non-shared functional blocks and/or methods, depending on the CABAC and CAVLC entropy encoding architectures and/or methods. For example, if CABAC encoding requires common or fixed compression, a trade-off may be made between the complexity of variable length binarization/variable length bin decoding and the complexity of the corresponding arithmetic encoding/decoding functional blocks and/or methods. Alternatively, if CABAC or CAVLC encoding requires increased or decreased compression, appropriately selected and modified variable length binarization/variable length bin decoding and corresponding arithmetic encoding/decoding functional blocks and/or methods may be implemented.
It may be readily understood that, by utilizing certain commonly shared and commonly utilized functional blocks and/or methodologies within a CABAC/CAVLC entropy coding architecture and/or methodology, savings may be achieved in circuitry, hardware, software, processing modules, components, real estate, die size, and/or processing resources, etc., in that much is used in support of both CABAC and CAVLC coding.
Fig. 21, 22A, 22B, 23, 24A and 24B illustrate various embodiments of methods performed in accordance with video encoding (e.g., within one or more communication devices).
Referring to method 2100 of fig. 21, method 2100 begins by receiving a plurality of syntax elements, as shown in block 2110. The method 2100 continues with operating the variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits, as shown at block 2120.
The method 2100 then operates by operating an arithmetic encoding module to process the plurality of CABAC bins to generate a plurality of CABAC bits, as shown in block 2130. From certain perspectives, and for certain embodiments, the operations of the blocks 2120 and 2130 may be viewed as corresponding to entropy encoding (e.g., such as being performed in accordance with the operation of an entropy encoder). The method 2100 continues by selectively outputting the plurality of CABAC bits or the plurality of CAVLC bits, as shown in block 2140.
Referring to method 2200 of FIG. 22A, method 2200 begins by receiving a plurality of syntax elements, as shown in block 2210. The method 2200 continues with operating the variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Variable Length Coding (CAVLC) bits, as shown at block 2220. The method 2200 then operates to output a plurality of CAVLC bits, as shown at block 2230.
Referring to method 2201 of fig. 22B, method 2201 begins by receiving a plurality of syntax elements, as shown at block 2211. The method 2201 then operates on the variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins, as shown at block 2221. The method 2201 continues with operating the arithmetic coding module to process the plurality of CABAC bins to generate a plurality of CABAC bits, as shown at block 2231. The method 2201 then operates to output a plurality of CABAC bits, as shown in block 2241.
Referring to method 2300 of fig. 23, the method 2300 begins by receiving a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bits or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits, as shown in block 2310. The method 2300 continues by operating an arithmetic decoding module to process the plurality of CABAC bits to generate a plurality of CABAC bins, as shown in block 2320.
The method 2300 then operates by operating a variable length bin decoding module to process the plurality of CAVLC bits or the plurality of CABAC bins to generate a plurality of estimates of a plurality of syntax elements, as shown in block 2330. From certain perspectives, and for certain embodiments, the operations of the blocks 2320 and 2330 may be viewed as corresponding to entropy decoding (e.g., such as being performed in accordance with the operation of an entropy decoder).
Referring to method 2400 of FIG. 24A, method 2400 begins with receiving a plurality of Context Adaptive Variable Length Coding (CAVLC) bits, as shown at block 2410. The method 2400 continues with operating the variable length bin decoding module to process the plurality of CAVLC bits to generate a plurality of estimates of a plurality of syntax elements, as shown at block 2420.
Referring to method 2401 of fig. 24B, the method 2401 begins by receiving a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bits, as shown in block 2411. The method 2401 then operates by operating an arithmetic decoding module to process the plurality of CABAC bits to generate a plurality of CABAC bins, as shown in block 2421. The method 2401 continues by operating a variable length bin decoding module to process the plurality of CABAC bins to generate a plurality of estimates of a plurality of syntax elements, as shown in block 2431.
It is also noted that various operations and functions described with respect to the various methods herein may be performed within a communication device, such as using a baseband processing module and/or a processing module implemented therein and/or other components therein.
As used herein, the terms "substantially" and "approximately" provide industry-accepted tolerances for their corresponding terms and/or relativity between items. The industry-accepted tolerance ranges from one percent to fifty percent and corresponds to, but is not limited to, component values, integrated circuit process variations, temperature variations, rise and fall times, and/or thermal noise. Relativity between terms ranges from a few percent difference to a large difference. As also used herein, the terms "operatively coupled," "coupled," and/or "coupled" include direct coupling between items and/or indirect coupling between items via intermediate items (e.g., items include, but are not limited to, components, elements, circuits, and/or modules), where, for indirect coupling, the intermediate items do not alter information of a signal, but may adjust its current level, voltage level, and/or power level. As also used herein, inferred coupling (i.e., where one element is coupled to another element by inference) includes direct and indirect coupling between two items in the same manner as "coupled to". Also as used herein, the term "operable to" or "operably coupled to" indicates that an item includes one or more of a power connection, input, output, etc., that when enabled performs one or more of its corresponding functions, and also includes inferred coupling to one or more other items. As also used herein, the term "associated with" includes direct and/or indirect coupling of an individual item and/or one item embedded within another item. As used herein, the term "compares favorably", indicates that a comparison of two or more terms, signals, etc., provides a desired relationship. For example, a favorable comparison may be achieved when the desired relationship is that signal 1 has a greater magnitude than signal 2, when signal 1 has a greater magnitude than signal 2, or when signal 2 has a lesser magnitude than signal 1.
Also as used herein, the terms "processing module," "processing circuit," and/or "processing unit" (e.g., including various modules and/or circuits such as may operate, implement, and/or for encoding, for decoding, for baseband processing, etc.) may be a single processing device or a plurality of processing devices. The processing device may be a microprocessor, microcontroller, digital signal processor, microcomputer, central processing unit, field programmable gate array, programmable logic device, state machine, logic circuitry, analog circuitry, digital circuitry, and/or any device that manipulates signals (analog and/or digital) based on hard coded circuitry and/or operational instructions. A processing module, processing circuit, and/or processing unit may have associated memory and/or integrated memory elements, which may be a single memory device, multiple memory devices, and/or embedded circuitry of the processing module, processing circuit, and/or processing unit. The memory device may be Read Only Memory (ROM), Random Access Memory (RAM), volatile memory, non-volatile memory, static memory, dynamic memory, cache memory, and/or any device that stores digital information. Note that if the processing module, processing circuit, and/or processing unit includes more than one processing device, the processing devices may be centrally located (e.g., directly coupled together via a wired and/or wireless bus structure) or distributively located (e.g., cloud computing indirectly coupled via a local area network and/or a wide area network). It is also noted that if the processing module, processing circuit, and/or processing unit implements one or more of its functions via a state machine, analog circuitry, digital circuitry, and/or logic circuitry, the memory and/or memory elements storing the corresponding operational instructions may be embedded within, or external to, the circuitry comprising the state machine, analog circuitry, digital circuitry, and/or logic circuitry. It is also noted that the memory elements may store, and the processing modules, processing circuits, and/or processing units execute, hard-coded and/or operational instructions corresponding to at least some of the steps and/or functions depicted in one or more of the figures. The memory device or memory element may be included in an article of manufacture.
The present invention has been described above with the aid of method steps illustrating specific functions and relationships of the invention. The boundaries and sequence of these functional building blocks and method steps have been arbitrarily defined herein for the convenience of the description. Alternate boundaries and sequences may be defined so long as the specified functions and relationships are appropriately performed. Accordingly, any alternative boundary or sequence is within the scope and spirit of the invention. Further, boundaries of these functional building blocks have been arbitrarily defined for the convenience of the description. Alternate boundaries may be defined so long as certain important functions are properly performed. Likewise, flow diagram blocks may have been arbitrarily defined herein to illustrate certain important functions. To the extent used, the flow diagram block boundaries and sequence may be otherwise defined and still perform some significant functions. Alternative definitions of both functional building blocks and flowchart blocks and sequences are within the scope and spirit of the present invention. Those of ordinary skill in the art will also recognize that the functional building blocks and other illustrative blocks, modules, and components herein can be implemented as discrete components, application specific integrated circuits, processors executing appropriate software, etc., or any combination thereof.
The present invention may also have been described, at least in part, in terms of one or more embodiments. An embodiment is used herein to illustrate the invention, an aspect thereof, a feature thereof, a concept thereof, and/or an example thereof. A physical embodiment of a device, article of manufacture, machine, and/or process embodying the present invention may include one or more of the aspects, features, concepts, examples, etc. described with reference to the embodiments discussed herein. Furthermore, across the various figures, the embodiments may incorporate functions, steps, modules, etc. with the same or similar names and the same or different reference numbers; the functions, steps, modules, etc. so designated may be the same or similar, or they may be different.
Signals to, from, and/or between elements in any of the figures presented herein may be analog or digital, continuous time or discrete time, and single-ended or differential, unless specifically stated to the contrary. For example, if a signal path is shown as a single-ended path, it also represents a differential signal path. Likewise, if a signal path is shown as a differential path, it also represents a single-ended signal path. Although one or more particular architectures are described herein, it will be apparent to those of ordinary skill in the art that other architectures may likewise be implemented, using one or more data buses, direct connections between elements, and/or indirect couplings between other elements that are not expressly shown.
The term "module" is used in the description of various embodiments of the present invention. A module includes functional blocks that are implemented via hardware to perform one or more module functions, such as processing one or more input signals to generate one or more output signals. The hardware implementing the module may itself operate in conjunction with software and/or firmware. As used herein, a module may include one or more sub-modules, which are themselves modules.
Although particular combinations of features and functions have been expressly described herein, other combinations of these features and functions are likewise possible. The present invention is not limited to the particular examples disclosed herein, and these other combinations are expressly incorporated.
Claims (8)
1. A device for unified binarization for CABAC/CAVLC entropy coding, comprising:
an input to receive a plurality of syntax elements;
an entropy encoder operative to adaptively perform entropy encoding according to a plurality of complexity operation modes, the entropy encoder comprising:
a variable length binarization module to process the plurality of syntax elements to generate a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits; and
an arithmetic coding module that processes the plurality of CABAC bins to produce a plurality of CABAC bits; and
at least one output to selectively output the plurality of CABAC bits or the plurality of CAVLC bits, the plurality of CABAC bits being output when the entropy encoder operates according to a first complexity operation mode of the plurality of complexity operation modes and the plurality of CAVLC bits being output when the entropy encoder operates according to a second complexity operation mode of the plurality of complexity operation modes,
wherein the plurality of CABAC bins are the plurality of CAVLC bits.
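For illustration only, and not as the claimed implementation, the following is a minimal sketch of the encoder structure of claim 1. It assumes order-0 exponential-Golomb codes as the shared variable length binarization (the claims do not fix a particular binarization table) and replaces the arithmetic coding module with a pass-through stub, since a conforming CABAC engine is beyond the scope of a sketch; the names `UnifiedBinarizationEncoder`, `exp_golomb_bins`, and `arithmetic_encode_stub` are hypothetical.

```python
from typing import List

def exp_golomb_bins(value: int) -> List[int]:
    """Order-0 exponential-Golomb binarization of a non-negative integer.

    Hypothetical stand-in for the claimed variable length binarization
    module; the same bin string serves as CAVLC bits directly, or as
    CABAC bins fed to the arithmetic coding module.
    """
    code = value + 1
    prefix_len = code.bit_length() - 1            # unary prefix of zeros
    return [0] * prefix_len + [int(b) for b in bin(code)[2:]]

def arithmetic_encode_stub(bins: List[int]) -> List[int]:
    # Placeholder for the arithmetic coding module: a real CABAC engine
    # would context-model and range-code the bins. A pass-through keeps
    # the sketch runnable without pretending to be a conforming coder.
    return list(bins)

class UnifiedBinarizationEncoder:
    """Minimal sketch: one binarizer feeding two complexity modes."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CABAC", "CAVLC"):
            raise ValueError("mode must be 'CABAC' or 'CAVLC'")
        self.mode = mode

    def encode(self, syntax_elements: List[int]) -> List[int]:
        # Shared stage: the bin string doubles as CABAC bins / CAVLC bits.
        bins: List[int] = []
        for element in syntax_elements:
            bins.extend(exp_golomb_bins(element))
        if self.mode == "CAVLC":
            return bins                           # low-complexity path
        return arithmetic_encode_stub(bins)       # high-complexity path
```

What the sketch makes concrete is the wherein-clause above: both complexity modes share one binarization stage, so the CABAC bins and the CAVLC bits are literally the same symbol string.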
2. The device of claim 1, wherein:
the entropy encoder performs entropy encoding according to the first complexity operation mode of the plurality of complexity operation modes at or during a first time;
the entropy encoder performs entropy encoding according to the second complexity operation mode of the plurality of complexity operation modes at or during a second time; and
the second complexity operation mode is less complex than the first complexity operation mode.
3. The device of claim 1, wherein:
the device is a first communication device; and further comprising:
a second communication device that communicates with the first communication device via at least one communication channel, the second communication device comprising:
at least one additional input receiving the plurality of CABAC bits or the plurality of CAVLC bits; and
an entropy decoder, comprising:
an arithmetic decoding module that processes the plurality of CABAC bits to produce a plurality of CABAC bins; and
a variable length bin decoding module that processes the plurality of CAVLC bits or the plurality of CABAC bins to produce a plurality of estimates of the plurality of syntax elements; and wherein:
the second communication device is at least one of a computer, a laptop computer, a High Definition (HD) television, a Standard Definition (SD) television, a handheld media unit, a set-top box (STB), and a Digital Video Disc (DVD) player.
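A matching decoder-side sketch of the entropy decoder recited in claim 3, under the same assumptions as the encoder sketch above (order-0 exponential-Golomb; the arithmetic decoding module is again a pass-through stub; all names are hypothetical):

```python
from typing import List

def arithmetic_decode_stub(bits: List[int]) -> List[int]:
    # Mirrors the encoder-side pass-through stub so the sketches round-trip.
    return list(bits)

def decode_exp_golomb_bins(bins: List[int]) -> List[int]:
    """Variable length bin decoding: invert a concatenation of order-0
    exponential-Golomb bin strings into syntax-element estimates."""
    estimates: List[int] = []
    i = 0
    while i < len(bins):
        zeros = 0
        while bins[i] == 0:                       # count the unary prefix
            zeros += 1
            i += 1
        code = 0
        for _ in range(zeros + 1):                # prefix_len + 1 info bins
            code = (code << 1) | bins[i]
            i += 1
        estimates.append(code - 1)
    return estimates

class UnifiedBinDecoder:
    """Consumes CAVLC bits directly, or CABAC bits via the decoding stub."""

    def __init__(self, mode: str) -> None:
        self.mode = mode

    def decode(self, bits: List[int]) -> List[int]:
        bins = bits if self.mode == "CAVLC" else arithmetic_decode_stub(bits)
        return decode_exp_golomb_bins(bins)
```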
4. A method for operating an entropy encoder of a communication device, the method comprising:
receiving a plurality of syntax elements;
operating a variable length binarization module of the entropy encoder to process the plurality of syntax elements to produce a plurality of Context Adaptive Binary Arithmetic Coding (CABAC) bins or a plurality of Context Adaptive Variable Length Coding (CAVLC) bits; and
operating an arithmetic coding module of the entropy encoder to process the plurality of CABAC bins to produce a plurality of CABAC bits; and
selectively outputting the plurality of CABAC bits or the plurality of CAVLC bits,
wherein the plurality of CABAC bins are the plurality of CAVLC bits.
5. The method of claim 4, wherein:
the entropy encoder is operative to adaptively perform entropy encoding according to a plurality of complexity operation modes.
6. The method of claim 4, further comprising:
operating the entropy encoder to perform entropy encoding according to a first complexity operation mode at or during a first time; and
operating the entropy encoder to perform entropy encoding according to a second complexity operation mode at or during a second time.
7. The method of claim 4, further comprising:
outputting the CABAC bits when the entropy encoder operates according to a first complexity operation mode; and
outputting the CAVLC bits when the entropy encoder operates according to a second complexity operation mode that is less complex than the first complexity operation mode.
8. The method of claim 4, further comprising:
operating an additional communication device to communicate with the communication device via at least one communication channel by:
receiving the plurality of CABAC bits or the plurality of CAVLC bits;
operating an arithmetic decoding module to process the plurality of CABAC bits to produce a plurality of CABAC bins;
operating a variable length bin decoding module to process the plurality of CAVLC bits or the plurality of CABAC bins to generate a plurality of estimates of the plurality of syntax elements, wherein the additional communication device is at least one of a computer, a laptop computer, a High Definition (HD) television, a Standard Definition (SD) television, a handheld media unit, a set-top box (STB), and a Digital Video Disc (DVD) player.
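Finally, a short usage example tying the two hypothetical sketches above together (both must be in scope): the same syntax elements are coded under each complexity operation mode, as in claims 6 and 7, and the decoded estimates match the originals in both cases because the bins and the bits coincide.

```python
# Round trip in both complexity modes; assumes the encoder and decoder
# sketches above are defined in the same module.
syntax_elements = [0, 3, 1, 7, 2]

for mode in ("CABAC", "CAVLC"):   # first, then second complexity mode
    bits = UnifiedBinarizationEncoder(mode).encode(syntax_elements)
    estimates = UnifiedBinDecoder(mode).decode(bits)
    assert estimates == syntax_elements
    print(mode, "->", "".join(map(str, bits)))
```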
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201161515819P | 2011-08-05 | 2011-08-05 | |
| US61/515,819 | 2011-08-05 | 2011-08-05 | |
| US13/523,818 | 2011-08-05 | 2012-06-14 | |
| US13/523,818 (US9231616B2) | 2011-08-05 | 2012-06-14 | Unified binarization for CABAC/CAVLC entropy coding |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1179791A1 (en) | 2013-10-04 |
| HK1179791B (en) | 2016-12-30 |
Similar Documents
| Publication | Title |
|---|---|
| US11800086B2 (en) | Sample adaptive offset (SAO) in accordance with video coding |
| EP2575366B1 (en) | Signaling of coding unit prediction and prediction unit partition mode for video coding |
| KR101437027B1 (en) | Adaptive loop filtering in accordance with video coding |
| US9693064B2 (en) | Video coding infrastructure using adaptive prediction complexity reduction |
| US9231616B2 (en) | Unified binarization for CABAC/CAVLC entropy coding |
| US20130343447A1 (en) | Adaptive loop filter (ALF) padding in accordance with video coding |
| TWI524739B (en) | Sample adaptive offset (SAO) in accordance with video coding |
| CN103108180B (en) | Method and apparatus for determining video coding sub-block size based on infrastructure capabilities and current conditions |
| US20130235926A1 (en) | Memory efficient video parameter processing |
| HK1179791B (en) | Unified binarization for CABAC/CAVLC entropy coding |
| HK1180503B (en) | Adaptive loop filtering in accordance with video coding |
| HK1185482A (en) | Sample adaptive offset (SAO) in accordance with video coding |
| HK1179452A (en) | Signalling of prediction size unit in accordance with video coding |
| HK1185481A (en) | Frequency domain sample adaptive offset (SAO) |
| HK1183578B (en) | Method and apparatus of video coding sub-block sizing determining based on infrastructure capabilities and current conditions |