
HK1182249B - Intra smoothing filter for video coding - Google Patents


Info

Publication number
HK1182249B
Authority
HK
Hong Kong
Prior art keywords
mode
intra
prediction
filtering
block
Prior art date
Application number
HK13109524.1A
Other languages
Chinese (zh)
Other versions
HK1182249A1 (en)
Inventor
Geert Van der Auwera
Xianglin Wang
Muhammed Zeyd Coban
Marta Karczewicz
Yunfei Zheng
Original Assignee
Qualcomm Incorporated (高通股份有限公司)
Priority date
Filing date
Publication date
Priority claimed from US 13/248,968 (granted as US 9,008,175 B2)
Application filed by Qualcomm Incorporated (高通股份有限公司)
Publication of HK1182249A1
Publication of HK1182249B


Description

Intra smoothing filter for video coding
Technical Field
This disclosure relates to video coding, and more particularly, to intra-prediction video coding.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video gaming devices, video gaming consoles, cellular or satellite radio telephones, video teleconferencing devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, or ITU-T H.264/MPEG-4 Part 10, Advanced Video Coding (AVC), and extensions of such standards, to more efficiently transmit and receive digital video information. New video standards continue to emerge and evolve, such as the High Efficiency Video Coding (HEVC) standard being developed by the Joint Collaborative Team on Video Coding (JCT-VC), a collaboration between MPEG and ITU-T. The emerging HEVC standard is sometimes referred to as H.265.
Video compression techniques perform spatial prediction and/or temporal prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video frame or slice may be partitioned into blocks referred to as Coding Units (CUs). In general, a CU has a similar purpose to a macroblock of the H.264 standard, except that a CU does not have a size distinction. A CU in an intra-coded (I) frame or slice is encoded using spatial prediction with respect to neighboring CUs in the same frame or slice. A CU in an inter-coded (P or B) frame or slice may use spatial prediction with respect to neighboring CUs in the same frame or slice, or temporal prediction with respect to other reference frames.
Disclosure of Invention
In general, techniques are described for applying smoothing filters to pixels around prediction samples used in intra-prediction coding of video data. In particular, the techniques provide a method of determining and applying a smoothing filter to prediction samples used in intra-prediction coding. In some examples, the techniques may include selecting a smoothing filter to apply to pixels surrounding a prediction sample based on a combination of a block size associated with a current block and an intra-prediction mode.
In fixed-mode dependent intra smoothing, a video encoder may determine the type of smoothing filter to apply to the prediction samples based on the block size associated with the current block in combination with the intra prediction mode. The filter may be looked up in a filter table using the block size and intra prediction mode combination. In this case, there may be no overhead sent to the decoder to signal the filter type for smoothing, as the filter table will be available at the decoder side, and the block size and intra prediction mode combination may be used at the decoder side in a similar manner.
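The fixed-mode lookup can be pictured as a shared table keyed by the (block size, intra-prediction mode) pair. The sketch below is a minimal illustration of that idea; the table entries, filter taps, and mode numbers are assumptions, not values taken from the disclosure:

```python
# Hypothetical filter table shared by encoder and decoder:
# (block_size, intra_mode) -> smoothing filter taps, or None for no smoothing.
# Because both sides hold the same table, no filter type needs to be signaled.
FILTER_TABLE = {
    (8, 18): (1, 2, 1),    # e.g. a diagonal mode on an 8x8 block: 3-tap filter
    (32, 18): (1, 2, 1),
    (32, 26): None,        # e.g. a vertical mode: reference samples used as-is
}

def select_filter(block_size, intra_mode):
    """Look up the smoothing filter for a block size / intra mode combination."""
    return FILTER_TABLE.get((block_size, intra_mode))

assert select_filter(8, 18) == (1, 2, 1)
assert select_filter(32, 26) is None
```

Since the decoder performs the same lookup with the same block size and mode, the filtering decision is reproduced without any per-block overhead.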
In adaptive mode-dependent intra smoothing, a video encoder may select a smoothing filter from two filter options, one from a first filter table and the other from a second filter table, apply both filters, and determine which one gives better results. In this case, syntax indicating the selected filter may be encoded with the video and sent to the decoder to signal the selected filter type. In one example, the video encoder encodes a filtering indication if the filter from the second filter table yields better results. When the filter from the first filter table is used, no filtering indication is encoded.
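The adaptive selection can be sketched as a two-candidate trial at the encoder. The cost metric (sum of absolute differences) and the `predict` callback below are illustrative stand-ins for the intra predictor, not part of the disclosure; only the signaling rule (indicate the filter only when the second-table filter wins) follows the scheme described above:

```python
def sad(a, b):
    """Sum of absolute differences, an illustrative prediction-cost metric."""
    return sum(abs(x - y) for x, y in zip(a, b))

def choose_filter(original, predict, ref, filt1, filt2):
    """Try the candidate filter from each table and return
    (chosen_filter, filtering_indication).

    `predict(ref, filt)` stands in for intra prediction from filtered
    reference samples. The indication is encoded (True) only when the
    second-table filter gives the lower cost; otherwise the first-table
    filter is used and nothing is signaled.
    """
    cost1 = sad(original, predict(ref, filt1))
    cost2 = sad(original, predict(ref, filt2))
    if cost2 < cost1:
        return filt2, True
    return filt1, False
```

A decoder would mirror this rule: apply the first-table filter by default, and switch to the second-table filter only when the indication is present.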
These techniques may reduce the amount of additional data encoded with blocks encoded using intra-prediction coding, and thus reduce the amount of data transmitted to a decoder. A video decoder receiving an intra-prediction encoded block may apply a smoothing filter using a first filter table during decoding unless a filtering indication is encoded with the block. If the filtering indication is encoded with the block, the video decoder may apply a smoothing filter during decoding using the filter indicated by the filtering indication.
In one example, the present invention is directed to a method comprising: the method may include determining a block size associated with a current block of video data, determining an intra-prediction coding mode associated with the current block, selecting a filter from a filter table to apply to the current block based on the block size and the coding mode, and intra-prediction coding the current block using the selected filter.
In another example, this disclosure is directed to an apparatus for coding video data, the apparatus comprising: a processor configured to determine a block size associated with a current block of video data, determine an intra-prediction coding mode associated with the current block, select a filter from a filter table to apply to the current block based on the block size and the coding mode, and intra-prediction code the current block using the selected filter.
In another example, this disclosure is directed to an apparatus for encoding video data, the apparatus comprising: the apparatus generally includes means for determining a block size associated with a current block of video data, means for determining an intra-prediction coding mode associated with the current block, means for selecting a filter from a filter table to apply to the current block based on the block size and the coding mode, and means for intra-prediction coding the current block using the selected filter.
In another example, this disclosure is directed to a computer-readable storage medium comprising instructions for coding video data that, when executed in a processor, cause the processor to: the method may include determining a block size associated with a current block of video data, determining an intra-prediction coding mode associated with the current block, selecting a filter from a filter table to apply to the current block based on the block size and the coding mode, and intra-prediction coding the current block using the selected filter.
Drawings
FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may utilize the techniques of this disclosure to support intra-smoothing.
Fig. 2A is a conceptual diagram illustrating a graph depicting an example set of prediction directions associated with various intra-prediction modes.
Fig. 2B is a conceptual diagram illustrating another graph depicting an example set of prediction directions associated with various intra-prediction modes.
Fig. 3 is a conceptual diagram illustrating an example set of prediction directions associated with various intra-prediction modes.
Fig. 4 is a block diagram illustrating an example of a video encoder that may implement techniques for supporting intra-smoothing.
Fig. 5 is a block diagram illustrating an example of a video decoder that decodes an encoded video sequence.
Fig. 6 is a flow diagram illustrating an example method for encoding video data in accordance with the techniques of this disclosure.
Fig. 7 is a flow diagram illustrating an example method for decoding video data in accordance with the techniques of this disclosure.
Fig. 8 is a flow diagram illustrating another example method for encoding video data in accordance with the techniques of this disclosure.
Fig. 9 is a flow diagram illustrating another example method for decoding video data in accordance with the techniques of this disclosure.
Detailed Description
In general, techniques are described for determining and applying a smoothing filter to pixels around prediction samples used in intra-prediction coding of video data. The techniques may include determining whether to apply a smoothing filter to a current block of video data being encoded. For example, the determination of whether to apply a smoothing filter may be based on the size of the current block being encoded and the encoding mode associated with the current block. The encoding mode of the current block may be based on a direction associated with intra-prediction coding of the current block, i.e., the direction, relative to the current block, from which the pixels of the current block are predicted. A filtering decision table may be generated and used to decide whether to apply intra smoothing filtering to the current block and/or which type of intra smoothing filter to apply to the current block. Thus, the filtering decision may be a function of the block size and the intra-prediction mode associated with the current block.
In video coding, prediction may be utilized to increase efficiency and reduce the amount of encoded video data, and thus reduce the amount of video data transmitted or stored for subsequent retrieval and display. Two types of prediction are often used in modern video coding systems: inter prediction and intra prediction. In inter-prediction, a current video block is predicted from a video block in another frame in a video sequence. In intra prediction, a current video block is predicted from video blocks within the same frame.
As described above, intra-predictive coding involves encoding pixels of a current block of a frame relative to pixels of previously encoded blocks of the frame by performing directional extrapolation of pixels surrounding the current block. In this way, the surrounding pixels are used as prediction samples to form the pixels of the current block. In general, the prediction samples are extrapolated along a specified direction to form the pixels of the current block. As the size of the current block increases, the prediction accuracy of pixels of the current block that are farther away from the prediction samples may degrade. To mitigate this degradation in intra-prediction coding, smoothing may be applied to the surrounding prediction samples.
The video coding process may apply a smoothing filter to the predicted pixels, perform and compare predictions using both filtered and unfiltered pixels, and determine and signal whether to use filtered or unfiltered pixels based on the comparison results. Using this approach, which may be referred to as adaptive intra smoothing (AIS), results in increased encoder complexity because it requires a smoothing filter to be applied to each block and the results compared to the unfiltered blocks. Also, using this approach of explicitly signaling whether a particular smoothing filter should be applied increases the overhead information associated with each coding unit. Additionally, entropy coding and decoding the AIS data (e.g., signals, smoothing filters, etc.) increases the complexity of the encoding and decoding processes.
Another way to perform AIS is to utilize filter indicator tables at the encoder and the decoder, where the indicator value depends on the block size and the intra-prediction mode, which indicates the prediction direction. The tables may be generated during the design phase, thus avoiding an increase in encoder computational complexity, since the decision as to whether or not to filter the pixels is not performed during the encoding process. However, certain changes to the video encoder/decoder may require generating a new table to replace the current one.
The techniques of this disclosure propose a simplified AIS scheme or Mode Dependent Intra Smoothing (MDIS) based on rules, wherein the rules as to whether to apply a smoothing filter are based on the size of the current block and the prediction mode or direction. MDIS may order prediction modes in a hierarchy, which may be mapped to different block sizes, with more modes being filtered according to the hierarchy for increasing block sizes. In some examples, the determination of whether to apply a smoothing filter may further distinguish between larger blocks and smaller blocks, where the method of determining whether to apply a smoothing filter may vary based on block size. In addition, for some video sequences, such as those with directional texture, applying a smoothing filter may provide little or no benefit. The techniques of this disclosure may also make smoothing filter determinations to address these issues. The intra-prediction and smoothing filter information may be encoded in the header of the corresponding block and decoded by a video decoder during decoding to apply the appropriate smoothing filter.
Fig. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may utilize the techniques of this disclosure to apply smoothing filters during intra-prediction coding. As shown in fig. 1, system 10 includes a source device 12, source device 12 transmitting encoded video to a destination device 14 via a communication channel 16. Source device 12 and destination device 14 may comprise any of a wide variety of devices. In some examples, source device 12 and destination device 14 may comprise wireless communication devices, such as wireless handsets, so-called cellular or satellite radiotelephones, or any wireless devices that may communicate video information over communication channel 16 (in which case communication channel 16 is wireless).
However, the techniques of this disclosure, which are generally related to techniques for intra-prediction coding, are not necessarily limited to wireless applications or environments. For example, these techniques may also be useful in a variety of other environments and devices, including devices that communicate via physical wire, optical fiber, or other physical or wireless media, which may be applied to over-the-air television broadcasts, cable television transmissions, satellite television transmissions, internet video transmissions, encoded digital video encoded onto a storage medium, or other scenarios. For example, the encoded video may be stored on a storage device for subsequent retrieval, decoding, and display by destination device 14. In addition, the encoding or decoding techniques may also be applied in a stand-alone device that does not necessarily communicate with any other device. Accordingly, communication channel 16 may comprise any combination of wireless or wired media or other storage media suitable for transmitting encoded video data, and devices 12, 14 may comprise any of a variety of wired or wireless media devices, such as a mobile phone, a smart phone, a digital media player, a set-top box, a television, a display, a desktop computer, a portable computer, a tablet computer, a game console, a portable gaming device, and so forth.
In the example of fig. 1, source device 12 includes a video source 18, a video encoder 20, a modulator/demodulator (modem) 22, and a transmitter 24. Destination device 14 includes a receiver 26, a modem 28, a video decoder 30, and a display device 32. In other examples, the source device and the destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18 (e.g., an external camera, a video storage archive, a computer graphics source, etc.). Likewise, destination device 14 may interface with an external display device, rather than including an integrated display device.
The system 10 illustrated in fig. 1 is merely one example. The techniques for applying a smoothing filter in intra-prediction coding may be performed by any digital video encoding and/or decoding device. Although the techniques of this disclosure are typically performed by a video encoding device, the techniques may also be performed by a video encoder/decoder (commonly referred to as a "CODEC"). Furthermore, the techniques of this disclosure may also be performed by a video preprocessor. Source device 12 and destination device 14 are merely examples of such coding devices, with source device 12 generating coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetric manner such that each of devices 12, 14 includes video encoding and decoding components. Thus, system 10 may support one-way or two-way video transmission between video devices 12, 14, such as for video streaming, video playback, video broadcasting, or video telephony.
Video source 18 of source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or video fed from a video content provider. As another alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones. However, as mentioned above, the techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20. The encoded video information may then be modulated by modem 22 according to a communication standard and transmitted to destination device 14 via transmitter 24. Modem 22 may include various mixers, filters, amplifiers, or other components designed for signal modulation. Transmitter 24 may include circuitry designed for transmitting data, including amplifiers, filters, and one or more antennas.
Receiver 26 of destination device 14 receives information via channel 16, and modem 28 demodulates the information. The video decoding process may likewise implement one or more of the techniques described herein for applying a smoothing filter during intra-prediction coding of the current block. The information communicated over channel 16 may include syntax information defined by video encoder 20, which may also be used by video decoder 30, including syntax elements that describe characteristics and/or processing of Prediction Units (PUs), Coding Units (CUs), or other units of the coded video, such as video slices, video frames, and video sequences or groups of pictures (GOPs). Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices, such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or another type of display device.
In the example of fig. 1, communication channel 16 may comprise any wireless or wired communication medium, such as a Radio Frequency (RF) spectrum or one or more physical transmission lines, or any combination of wireless and wired media. The communication channel 16 may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the internet. Communication channel 16 generally represents any suitable communication medium or collection of different communication media, including any suitable combination of wired or wireless media, for transmitting video data from source device 12 to destination device 14. Communication channel 16 may include a router, switch, base station, or any other apparatus that may be used to facilitate communications from source device 12 to destination device 14.
Video encoder 20 and video decoder 30 may operate in accordance with a video compression standard, such as the emerging High Efficiency Video Coding (HEVC) standard or the ITU-T H.264 standard, otherwise known as MPEG-4, Part 10, Advanced Video Coding (AVC). However, the techniques of this disclosure are not limited to any particular coding standard. Other examples include MPEG-2 and ITU-T H.263. Although not shown in fig. 1, in some aspects, video encoder 20 and video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams. The MUX-DEMUX units, if applicable, may conform to the ITU H.223 multiplexer protocol or other protocols such as the User Datagram Protocol (UDP).
The ITU-T H.264/MPEG-4 (AVC) standard was formulated by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a collective partnership known as the Joint Video Team (JVT). In some aspects, the techniques described in this disclosure may be applied to devices that generally conform to the H.264 standard. The H.264 standard, which may be referred to herein as the H.264 standard, the H.264 specification, or the H.264/AVC standard or specification, is described by the ITU-T Study Group in ITU-T Recommendation H.264, "Advanced video coding for generic audiovisual services," dated March 2005.
The Joint Video Team (JVT) is constantly working on extensions to H.264/AVC and developing new standards, e.g., for HEVC. HEVC standardization efforts are based on a model of the video coding device, referred to as the HEVC test model (HM). The HM assumes several additional capabilities of the video coding device relative to existing devices in accordance with, for example, ITU-T H.264/AVC. For example, h.264 provides nine intra-prediction encoding modes, while HM provides up to thirty-five intra-prediction encoding modes.
The HM refers to a block of video data as a Coding Unit (CU). Syntax data in the bitstream may define a Largest Coding Unit (LCU), which is the largest coding unit in terms of pixel data. In general, a CU has a similar purpose to a macroblock of the H.264 standard, except that a CU does not have a size distinction. Thus, a CU may be split into several sub-CUs. In general, a reference to a CU in this disclosure may refer to a largest coding unit of a picture or to a sub-CU of an LCU. An LCU may be split into sub-CUs, and each sub-CU may be further split into sub-CUs. Syntax data for a bitstream may define the maximum number of times an LCU may be split, referred to as the CU depth. Accordingly, the bitstream may also define a Smallest Coding Unit (SCU). This disclosure also uses the term "block" to refer to any of a CU, PU, or TU. Furthermore, where this disclosure refers to examples involving coding units or CUs, it should be understood that other examples may apply with respect to macroblocks substituted for coding units.
The LCU may be associated with a quadtree data structure. In general, a quadtree data structure contains one node per CU, where the root node corresponds to the LCU. If a CU is split into four sub-CUs, the node corresponding to the CU includes four leaf nodes, each corresponding to one of the sub-CUs. Each node of the quadtree data structure may provide syntax data for the corresponding CU. For example, a node in the quadtree may include a split flag indicating whether the CU corresponding to the node is split into sub-CUs. Syntax elements of a CU may be defined recursively and may depend on whether the CU is split into sub-CUs.
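The recursive node-per-CU structure above can be sketched as follows. The class shape and field names are illustrative assumptions; only the one-node-per-CU layout and the per-node split flag come from the description:

```python
class CUNode:
    """One node of the CU quadtree: a leaf CU, or a CU split into 4 sub-CUs."""
    def __init__(self, size, split=False, children=None):
        self.size = size              # e.g. 64 for a 64x64 CU
        self.split = split            # the split flag carried in the syntax
        self.children = children or []

def leaf_sizes(node):
    """Collect the sizes of all un-split (leaf) CUs under `node`, recursively,
    mirroring how syntax elements are defined recursively per the split flag."""
    if not node.split:
        return [node.size]
    sizes = []
    for child in node.children:
        sizes.extend(leaf_sizes(child))
    return sizes

# A 64x64 LCU whose first 32x32 sub-CU is split once more into four 16x16 CUs:
lcu = CUNode(64, split=True, children=[
    CUNode(32, split=True, children=[CUNode(16) for _ in range(4)]),
    CUNode(32), CUNode(32), CUNode(32),
])
assert leaf_sizes(lcu) == [16, 16, 16, 16, 32, 32, 32]
```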
An un-split CU (e.g., corresponding to a leaf node in the quadtree data structure) may include one or more Prediction Units (PUs). In general, a PU represents all or a portion of the corresponding CU, and includes data for retrieving a reference sample for the PU. For example, when the PU is intra-mode encoded, the PU may include data describing an intra-prediction mode for the PU. The data describing the intra-prediction mode may define, for example, the prediction directions associated with the intra-prediction mode of the PU (e.g., for the luma and chroma components) and a block size associated with the PU. As another example, when the PU is inter-mode encoded, the PU may include data defining a motion vector for the PU. The data defining a motion vector may describe, for example, a horizontal component of the motion vector, a vertical component of the motion vector, a resolution of the motion vector (e.g., one-quarter-pixel precision or one-eighth-pixel precision), a reference frame to which the motion vector points, and/or a reference frame list (e.g., list 0 or list 1) for the motion vector, as well as a motion vector predictor index. Data for the CU defining the PU(s) may also describe, for example, partitioning of the CU into one or more PUs. Partitioning modes may differ depending on whether the CU is uncoded, intra-prediction mode encoded, or inter-prediction mode encoded.
A CU having one or more PUs may also include one or more Transform Units (TUs). Following prediction using a PU, a video encoder may calculate residual values for the portion of the CU that corresponds to the PU. The residual values correspond to pixel difference values, which may be transformed into transform coefficients, quantized, and scanned to generate serialized transform coefficients for entropy coding. The size of a TU is not necessarily limited to the size of a PU. In particular, TUs may be the same size as, or smaller than, the corresponding PUs of a CU. In some embodiments, the maximum size of a TU may be the size of the corresponding CU. In general, a TU may comprise a data structure that includes residual transform coefficients associated with a given CU.
Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder circuits, such as one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, software, hardware, firmware, or any combinations thereof. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders. Either of video encoder 20 and video decoder 30 may be integrated as part of a combined encoder/decoder (CODEC) in a respective camera, computer, mobile device, subscriber device, broadcast device, set-top box, server, or the like.
A video sequence typically comprises a series of video frames. A group of pictures (GOP) typically includes a series of one or more video frames. The GOP may include syntax data in a header of the GOP, a header of one or more frames of the GOP, or elsewhere, that describes the number of frames included in the GOP. Each frame may include frame syntax data that describes the encoding mode of the respective frame. Video encoder 20 typically operates on video blocks within individual video frames in order to encode the video data. The video block may correspond to a Coding Unit (CU) or a Partition Unit (PU) of a CU. Video blocks may have fixed or different sizes, and their sizes may differ according to a specified coding standard. Each video frame may include multiple slices. Each slice may include multiple CUs, which may include one or more PUs.
For example, the HEVC test model (HM) supports prediction for various CU sizes. The size of the LCU may be defined by syntax information. Assuming that the size of a particular CU is 2N × 2N, the HM supports intra prediction in sizes of 2N × 2N or N × N, and inter prediction in symmetric sizes of 2N × 2N, 2N × N, N × 2N, or N × N. The HM also supports asymmetric partitioning for inter prediction in sizes of 2N × nU, 2N × nD, nL × 2N, and nR × 2N. In asymmetric partitioning, one direction of a CU is not partitioned, while the other direction is partitioned into 25% and 75%. The portion of the CU corresponding to the 25% partition is indicated by an "n" followed by "U" (up), "D" (down), "L" (left), or "R" (right). Thus, for example, "2N × nU" refers to a 2N × 2N CU partitioned horizontally, with a 2N × 0.5N PU on top and a 2N × 1.5N PU on the bottom.
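The 25% / 75% split arithmetic above can be checked with a small helper. This is an illustrative sketch only; the function name and mode strings are assumptions chosen to match the naming in the preceding paragraph:

```python
def amp_partitions(n, mode):
    """Return the two PU sizes, as (width, height) pairs in coding order, for
    an asymmetric partitioning mode of a 2Nx2N CU. The 25% part has extent
    N/2 along the partitioned direction; the 75% part has extent 3N/2."""
    full, quarter, rest = 2 * n, n // 2, 3 * n // 2
    if mode == "2NxnU":               # top PU is the 25% part
        return (full, quarter), (full, rest)
    if mode == "2NxnD":               # bottom PU is the 25% part
        return (full, rest), (full, quarter)
    if mode == "nLx2N":               # left PU is the 25% part
        return (quarter, full), (rest, full)
    if mode == "nRx2N":               # right PU is the 25% part
        return (rest, full), (quarter, full)
    raise ValueError(mode)

# For a 32x32 CU (N = 16), 2NxnU gives a 32x8 PU above a 32x24 PU:
assert amp_partitions(16, "2NxnU") == ((32, 8), (32, 24))
```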
In this disclosure, "nxn" and "N by N" are used interchangeably to refer to the pixel size, e.g., 16 x 16 pixels or 16 by 16 pixels, of a video block (e.g., CU, PU, or TU) in terms of the vertical and horizontal dimensions. In general, a 16 x 16 block will have 16 pixels in the vertical direction (y =16) and 16 pixels in the horizontal direction (x = 16). Likewise, an nxn block typically has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value. The pixels in a block may be arranged in rows and columns. Further, the block does not necessarily have to have the same number of pixels in the horizontal direction as in the vertical direction. For example, a block may comprise N × M pixels, where M is not necessarily equal to N.
As described above, a CU may be encoded using inter-prediction coding or intra-prediction coding. In intra prediction, a current block may be predicted with respect to blocks that have already been encoded in the same frame. Typically, the blocks of a frame are coded from left to right and top to bottom, or in some examples in a zigzag scan order. Thus, the current block may be predicted with respect to a block anywhere above the current block or a block to the left of the current block. The block size for intra-prediction coding may vary from a relatively small size (e.g., 2 × 2) to a relatively large size (e.g., 128 × 128). The prediction direction, i.e., the direction from an already encoded block to the current block, may determine the intra-prediction mode. The HM allows up to 34 modes, which include 33 directional intra-prediction modes (corresponding to extrapolation directions) and one DC mode. Each of the 33 directional intra-prediction modes corresponds to an angle between 0 and 180 degrees (or, in the example of fig. 2A, between 45 and 135 degrees clockwise from horizontal), with an angular resolution of approximately 5.625° indicating the direction along which pixels from the neighboring block are extrapolated to predict the pixels of the current block. The non-directional mode (denoted mode 2) corresponds to DC, in which samples of neighboring pixels are used to predict the pixels of the current block. In some examples, the HM may allow a 35th mode (mode 34) corresponding to a planar mode, in which the values of the pixels may be adjusted using a gradient function discussed in more detail below.
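The approximately 5.625° resolution follows directly from spreading the 33 directional modes uniformly over 180 degrees. This is a back-of-the-envelope illustration only; the actual mode-to-angle mapping in the HM is more involved, and the simple indexing below is an assumption:

```python
# 33 directional modes spanning 180 degrees: 32 gaps between adjacent
# directions, so the spacing between neighboring prediction angles is
# 180 / 32 = 5.625 degrees.
NUM_DIRECTIONAL_MODES = 33
step = 180.0 / (NUM_DIRECTIONAL_MODES - 1)
angles = [i * step for i in range(NUM_DIRECTIONAL_MODES)]

assert step == 5.625
assert angles[0] == 0.0 and angles[-1] == 180.0
```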
During intra-prediction coding, prediction accuracy may degrade. For example, as the size of a PU increases, the prediction accuracy of pixels farther from the prediction samples may degrade. Degradation in prediction accuracy may result in more residual data, which in turn may require more data to be encoded, transmitted, or stored. A smoothing filter may be applied to the prediction samples used to predict the block to improve prediction accuracy. However, in some cases, better results are achieved without using a smoothing filter. As noted above, current approaches utilize adaptive intra smoothing (AIS) techniques.
The technique of the present invention proposes a simplified AIS scheme based on rules, wherein the rules regarding whether to apply a smoothing filter are mainly based on the size of the current block and the prediction mode or direction. For example, in the HM, the block size may be changed from a smaller size (e.g., 2 × 2) to as large as 128 × 128. As will be explained in more detail below, the intra-prediction modes available for each block may vary based on size. For each block size, a smoothing filter may be needed when applying prediction in some directions but not in other directions, where the directions correspond to different intra-prediction modes. In one example, the combination of the size and the intra-prediction mode for which the smoothing filter is applied may be predefined in a look-up table (LUT). Thus, determining whether to apply a smoothing filter may be generic and remain the same regardless of encoder and/or decoder variations, and signaling of the filtering determination may not be required. In addition, making a determination whether to apply a smoothing filter does not introduce significant encoder complexity.
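One compact way such a (size, mode) LUT can be realized, resembling the approach later adopted in the HEVC test model, is a per-block-size threshold on the mode's angular distance from the purely horizontal and vertical directions: the farther a mode is from those directions, the more likely smoothing helps. The threshold values, mode numbers, and function below are illustrative assumptions, not values taken from this disclosure:

```python
import math

# Illustrative thresholds indexed by log2 of the block size. Smaller
# thresholds for larger blocks mean that more modes get filtered as the
# block size increases. (Values are assumptions for the sketch.)
THRESHOLDS = {3: 7, 4: 1, 5: 0}     # 8x8, 16x16, 32x32
HOR_MODE, VER_MODE, DC_MODE = 10, 26, 1

def use_smoothing(block_size, mode):
    """Decide whether to smooth the reference samples, based on the mode's
    distance (in mode indices) from the nearest of the purely horizontal
    and vertical prediction directions."""
    if mode == DC_MODE:
        return False
    log2_size = int(math.log2(block_size))
    if log2_size not in THRESHOLDS:
        return False
    dist = min(abs(mode - HOR_MODE), abs(mode - VER_MODE))
    return dist > THRESHOLDS[log2_size]

assert use_smoothing(32, 18)         # diagonal mode on a large block: filtered
assert not use_smoothing(16, 26)     # pure vertical: never filtered
```

Because this rule is a pure function of block size and mode, encoder and decoder reach the same decision independently, which is what makes the scheme signaling-free.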
In some examples, the determination of whether to apply a smoothing filter may further distinguish between larger blocks and smaller blocks, where the method of determining whether to apply the filter may vary based on block size. For example, for some block sizes, the encoder may perform prediction using both filtered and unfiltered pixels and compare the results to determine whether to use a smoothing filter. The encoder then signals the filtering mode, i.e., whether a smoothing filter was used when encoding the block. In one example, the filtering mode is signaled only if it differs from the default filtering decision that would be obtained from the LUT.
In one example, one or more smoothing filters (e.g., filter candidates) may be defined. For each combination of block size and intra-prediction mode, a table may define the appropriate filter to apply. For combinations for which no smoothing filter is used, the defined filter may be an all-pass filter. The primary table may include the most probable smoothing filter for each corresponding combination of block size and intra-prediction mode. In some examples, a secondary table (or additional tables) may be defined to include the second most probable smoothing filter (or further smoothing filters, in the case of additional tables) for each corresponding combination of block size and intra-prediction mode. In one example, the most probable smoothing filter information may be utilized unless otherwise indicated. Information associated with intra-prediction coding (e.g., block size and intra-prediction mode), as well as smoothing filter information (e.g., filtering mode and possibly a smoothing filter table), may be encoded in a header (or other data structure, such as a footer) of the corresponding block and decoded by a video decoder during decoding to apply the appropriate smoothing filter.
After intra-prediction or inter-prediction coding to generate prediction data, e.g., PUs for a CU, video encoder 20 may calculate residual data to generate one or more Transform Units (TUs) for the CU. The PUs of a CU may comprise pixel data in the spatial domain (also referred to as the pixel domain), while the TUs of a CU may comprise coefficients in the transform domain (e.g., after applying a transform to the residual video data, such as a Discrete Cosine Transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform). The residual data may correspond to pixel differences between pixels of the unencoded picture and prediction values of PUs of the CU. Video encoder 20 may form one or more TUs, including residual data for the CU. Video encoder 20 may then transform the TU.
After any transform is performed to produce transform coefficients, quantization of the transform coefficients may be performed. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the coefficients. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, during quantization, an n-bit value may be rounded down to an m-bit value, where n is greater than m.
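The bit-depth reduction described above can be illustrated with a minimal sketch. This is an illustrative simplification, not the quantization scheme of any particular codec; the default bit depths are assumptions chosen for the example:

```python
def quantize_coefficient(value, n=16, m=8):
    """Reduce an n-bit transform coefficient to an m-bit value by
    discarding the (n - m) least significant bits of its magnitude
    (i.e., rounding the magnitude down), preserving the sign."""
    shift = n - m
    sign = -1 if value < 0 else 1
    return sign * (abs(value) >> shift)

# A 16-bit coefficient magnitude reduced to an 8-bit representation:
print(quantize_coefficient(40000))   # 156 (40000 >> 8)
print(quantize_coefficient(-40000))  # -156
```

In a real codec the quantizer also divides by a step size derived from a quantization parameter; the sketch above shows only the bit-depth aspect mentioned in the text.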
In some examples, video encoder 20 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that may be entropy encoded. In other examples, video encoder 20 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, video encoder 20 may entropy encode the one-dimensional vector, e.g., according to Content Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), or another entropy coding method.
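The predefined scan order described above can be illustrated with the classic zigzag pattern, which walks the block along anti-diagonals in alternating directions so that low-frequency coefficients come first and trailing zeros group together for entropy coding. This is a generic sketch; the actual scan order an encoder uses may differ:

```python
def zigzag_scan(block):
    """Serialize an N x N block of quantized coefficients into a 1-D
    list by traversing anti-diagonals in alternating directions."""
    n = len(block)
    coords = [(r, c) for r in range(n) for c in range(n)]
    # Diagonal index d = r + c; odd diagonals are walked with r
    # ascending, even diagonals with r descending.
    coords.sort(key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return [block[r][c] for r, c in coords]

block = [
    [9, 8, 4, 0],
    [7, 5, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
]
print(zigzag_scan(block))
# [9, 8, 7, 3, 5, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```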
A processing unit configured for entropy coding, or another processing unit, may perform other processing functions, such as zero run length coding of quantized coefficients, and/or generation of syntax information such as Coded Block Pattern (CBP) values, macroblock types, coding modes, LCU sizes, and so forth.
Video decoder 30 of destination device 14 may be configured to perform techniques similar to, and substantially symmetrical with, any or all of the techniques of video encoder 20 of this disclosure. For example, video decoder 30 may receive an entropy-encoded CU of a video frame. This received data may include syntax elements that indicate whether the CU was encoded with intra-prediction coding at video encoder 20. In accordance with the techniques of this disclosure, syntax elements associated with intra-prediction coding may include information such as the intra-prediction mode or prediction direction, the block size, and, as applicable, other information associated with the smoothing filter. For example, video decoder 30 may determine that the CU was encoded using intra-prediction coding and decode syntax elements included with the encoded CU to determine the intra-prediction mode for the CU. Video decoder 30 may also determine the size of the block used in intra-prediction coding, and using the block size and intra-prediction mode information, video decoder 30 may determine whether to apply a smoothing filter to the prediction pixels when predicting the block.
In one example, video decoder 30 may determine whether to apply a smoothing filter based on the combination of the block size and the intra-prediction mode used to encode the block. If no additional indication is signaled, video decoder 30 may utilize the most probable smoothing filter associated with the combination of block size and intra-prediction mode. If the smoothing filter for the combination differs from the most probable smoothing filter for that combination, video decoder 30 may decode an indication from the syntax elements associated with the block and select the smoothing filter from among the second (or subsequent) most probable smoothing filters for the combination.
In this way, the amount of information encoded to indicate whether to apply a smoothing filter in intra-prediction coding is minimized, and in some cases eliminated. For example, when the smoothing filter application decision is the most likely smoothing filter for a combination of block size and intra-prediction mode, then no signaling is encoded regarding the smoothing filter, and video decoder 30 is able to make the determination without requiring additional information other than block size and intra-prediction mode.
Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable encoder or decoder circuits, as applicable, such as one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic circuitry, software, hardware, firmware, or any combinations thereof. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined video encoder/decoder (CODEC). An apparatus including video encoder 20 and/or video decoder 30 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.
Fig. 2A is a conceptual diagram illustrating a graph depicting an example set of prediction directions associated with various intra-prediction modes. As described above, the HM provides up to thirty-three intra-prediction directional coding modes. An HM encoder (e.g., video encoder 20) may be configured such that the set of available intra-prediction modes for a block may depend on the size of the block or PU. That is, the size of a block may determine the number of intra-prediction modes available for the block from which an encoder may select an intra-prediction mode to predict the block. Table 1 below illustrates one example of a correspondence between block sizes and intra-prediction modes that may be used for blocks of that size. The directions associated with the intra-prediction modes are illustrated in fig. 3. Mode 2, not shown on fig. 3, corresponds to a DC mode, in which the pixels of the current block are predicted using a sampling of neighboring pixels.
PU or block size Intra prediction mode
2×2 0~3
4×4 0~16
8×8 0~33
16×16 0~33
32×32 0~33
64×64 0~5
128×128 0~5
TABLE 1
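The correspondence in table 1 can be sketched as a simple mapping from block size to the set of available intra-prediction modes. The dictionary below merely transcribes table 1 for illustration; the names are hypothetical:

```python
# Available intra-prediction modes per block size, per Table 1.
MODES_BY_SIZE = {
    2: range(0, 4),      # modes 0~3
    4: range(0, 17),     # modes 0~16
    8: range(0, 34),     # modes 0~33
    16: range(0, 34),
    32: range(0, 34),
    64: range(0, 6),     # modes 0~5
    128: range(0, 6),
}

def available_modes(block_size):
    """Return the list of intra-prediction modes an encoder may
    select from for a square block of the given size."""
    return list(MODES_BY_SIZE[block_size])

print(len(available_modes(4)))   # 17 modes for a 4x4 PU
print(len(available_modes(8)))   # 34 modes for an 8x8 PU
```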
In the example of fig. 2A, block 106 (indicated in dashed outline) may be predicted from neighboring pixels 100A-100AG (neighboring pixels 100), depending on the selected intra-prediction mode. Arrows 102A-102AG (arrows 102) represent the directions or angles associated with the various intra-prediction modes. The example of fig. 2A represents the intra-prediction modes provided by the HM. However, in other examples, more or fewer intra-prediction modes may be provided. Although block 106 is illustrated as an 8 × 8 block of pixels, in general, a block may have any number of pixels, e.g., 4 × 4, 8 × 8, 16 × 16, 32 × 32, 64 × 64, 128 × 128, and so on. Although the example of block 106 is square, the techniques of this disclosure may also be applied to other block sizes, such as N × M blocks, where N is not necessarily equal to M.
The intra-prediction modes may be defined according to the angle of the prediction direction relative to a horizontal axis, e.g., an axis perpendicular to the vertical sides of block 106. Thus, each of arrows 102 may represent the particular angle of the prediction direction of the corresponding intra-prediction mode. In some examples, a directional intra-prediction mode may be defined by an integer pair (dx, dy), which may represent the direction in which reference pixels are extrapolated for the corresponding intra-prediction mode. That is, the angle of the intra-prediction mode may be calculated as dy/dx; in other words, the angle may be expressed in terms of the horizontal offset dx and the vertical offset dy. The value of the pixel at location (x, y) in block 106 may be determined from the one or more of neighboring pixels 100 through which a line at angle dy/dx passing through location (x, y) also passes.
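The (dx, dy) extrapolation just described can be sketched as follows. The helper below is a hypothetical illustration that projects a block pixel back to the reference row above the block; the sign conventions and the restriction to directions with dy ≠ 0 are assumptions of this sketch, not part of any standard:

```python
def reference_column(x, y, dx, dy):
    """For a prediction direction (dx, dy) with dy != 0, return the
    (possibly fractional) column where the line through block pixel
    (x, y) crosses the reference row just above the block (row -1).
    A fractional result would be interpolated from the two nearest
    reference pixels."""
    # The line through (x, y) with slope dy/dx reaches row -1 after
    # moving (y + 1) rows upward, shifting (y + 1) * dx / dy columns.
    return x - (y + 1) * dx / dy

# Vertical prediction (dx = 0): every pixel maps straight up.
print(reference_column(3, 5, 0, 1))   # 3.0
# A 45-degree diagonal (dx = 1, dy = 1) shifts one column per row.
print(reference_column(3, 0, 1, 1))   # 2.0
```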
Fig. 2B is a conceptual diagram illustrating a graph depicting another set of example prediction directions associated with various intra-prediction modes. In the example of fig. 2B, thirty-three intra-prediction directional encoding modes are shown, similar to those shown in fig. 2A. In this example, in addition to the 33 directional modes, there are two additional modes, mode 2 corresponding to the DC mode and mode 34 corresponding to the planar mode.
In the example of fig. 2B, the encoder and/or decoder may be configured such that the available set of intra-prediction modes for a block may depend on the size of the block (e.g., PU, CU, or TU). That is, the size of a block may determine the number of intra-prediction modes available for the block from which an encoder may select an intra-prediction mode to predict the block. Table 2 below illustrates another example of a correspondence between block sizes and intra-prediction modes that may be used for blocks of that size.
PU or block size Intra prediction mode
4×4 0~16,34
8×8 0~34
16×16 0~34
32×32 0~34
64×64 0~2,34
TABLE 2
Fig. 3 is a conceptual diagram illustrating an example set of prediction directions associated with various intra-prediction modes. Each of the 33 directional intra-prediction modes provided by the HM corresponds to a direction having a specified angle. For example, mode 1 corresponds to the horizontal extrapolation direction, at a 0° angle to the horizontal, where a block encoded using mode 1 is predicted from pixels directly to its left. As another example, mode 0 corresponds to the vertical extrapolation direction, at 90° from the horizontal, where a block encoded using mode 0 is predicted from pixels directly above it. As another example, modes 3, 6, and 9 correspond to diagonal extrapolation directions with an angular resolution of 45°. Modes 0, 1, and 3 to 33 correspond to prediction directions spanning 180° (between modes 6 and 9), with an angular resolution approaching 5.625°. In the example of the HM, mode 2 corresponds to the DC mode, where a sample of eight neighboring pixels is used to predict the pixels of the current block. Mode 34 corresponds to the planar mode, which may fit a linear plane function to a block for prediction purposes. The planar mode may be effective in areas of smoothly varying illumination. The planar mode may be effective for predicting large blocks, but may not be effective enough for predicting small blocks. The circles of fig. 3 represent the angular resolution of the directional intra-prediction modes associated with each circle. For example, the resolution between the modes on circle 0 (modes 0 and 1) is 90 degrees, the resolution between the modes on circle 1 (modes 3, 6, and 9) is 45 degrees, the resolution between the modes on circle 2 (modes 4, 5, 7, and 8) is 22.5 degrees, and so on.
Fig. 4 is a block diagram illustrating an example of video encoder 20 that may implement techniques for applying smoothing filters to pixels around prediction samples used in intra-prediction coding of video data. Video encoder 20 may perform intra and inter prediction of video blocks (including LCUs, CUs, and PUs) within a video frame, and calculate residual values that may be encoded as TUs. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given video frame. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video within adjacent frames of a video sequence. The intra mode (I-mode) may involve any of several space-based compression modes. Inter modes, such as uni-directional prediction (P-mode), bi-directional prediction (B-mode), or generalized P/B prediction (GPB-mode), may involve any of several time-based compression modes. Motion estimation unit 42 and motion compensation unit 44 may perform inter-prediction coding, while intra-prediction unit 46 may perform intra-prediction coding.
As shown in fig. 4, video encoder 20 receives a current video block within a video frame or slice to be encoded. In the example of fig. 4, video encoder 20 includes mode select unit 40, prediction unit 41, reference frame store 64, summer 50, transform unit 52, quantization unit 54, and entropy coding unit 56. Prediction unit 41 includes motion estimation unit 42, motion compensation unit 44, and intra prediction unit 46. For video block reconstruction, video encoder 20 also includes inverse quantization unit 58, inverse transform unit 60, and summer 62. A deblocking filter (not shown in fig. 4) may also be included to filter block boundaries to remove blocking artifacts from the reconstructed video. The deblocking filter will typically filter the output of summer 62, if desired.
During the encoding process, video encoder 20 receives a video frame or slice to be coded. A frame or slice may be divided into multiple video blocks or CUs. Motion estimation unit 42 and motion compensation unit 44 perform inter-prediction coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal compression. Intra-prediction unit 46 may perform intra-prediction coding of the received video block relative to one or more neighboring blocks in the same frame or slice as the block to be coded to provide spatial compression.
Mode selection unit 40 may select one of the coding modes, intra or inter, based on the error results and provide the selection of the coding mode to prediction unit 41. Prediction unit 41 then generates an intra-or inter-coded block and provides the coded block to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use as a reference frame. When mode selection unit 40 selects intra-prediction encoding for a block, intra-prediction unit 46 may select an intra-prediction mode for the block. For example, the selected intra-prediction mode may be one of the 35 modes provided by the HM. The number of intra-prediction modes from which an intra-prediction mode is selected for a given block may depend on the size of the block, as shown in the example of table 1 above.
In one example, when mode selection unit 40 selects inter-prediction coding, motion estimation unit 42 and motion compensation unit 44 may perform processing associated with inter-prediction coding to generate inter-coded blocks. Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors that estimate the motion of video blocks. For example, a motion vector may indicate a displacement of a prediction block or PU within a prediction reference frame (or other coded unit) relative to a current block being coded within a current frame (or other coded unit). A prediction block is a block found that closely matches the video block or PU to be coded in terms of pixel differences, which may be determined by Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), or other difference metrics.
The motion vector may also indicate a shift of a partition of the macroblock. Motion compensation may involve fetching or generating a prediction block based on a motion vector determined by motion estimation. In some examples, video encoder 20 may calculate values for sub-integer pixel positions of reference frames stored in reference frame store 64. For example, video encoder 20 may calculate values for a quarter-pixel position, an eighth-pixel position, or other fractional-pixel positions of the reference frame. Thus, motion estimation unit 42 may perform a motion search with respect to the full pixel position and the fractional pixel position and output a motion vector with fractional pixel precision.
Motion estimation unit 42 computes motion vectors for video blocks or PUs of inter-coded frames by comparing the PU to blocks of reference frames in reference frame store 64. Motion compensation unit 44 may also interpolate sub-integer pixels of a reference frame (e.g., an I frame or a P frame). For example, HM describes two lists: list 0, which includes reference frames having an earlier display order than the current frame being encoded; and list 1, which contains reference frames with a later display order than the current frame being encoded. Thus, the data stored in the reference frame store 64 may be organized according to these lists. In some cases, motion estimation unit 42 may further calculate a Motion Vector Difference (MVD) value for the motion vector and pass the MVD value to entropy coding unit 56 instead of the motion vector itself.
In one example, when mode select unit 40 selects intra-prediction coding, intra-prediction unit 46 may perform processing associated with intra-prediction coding to generate an intra-coded block. As described above, the number of intra-prediction modes available for a block may be determined based on the size of the block. Examples of the correspondence between block sizes and the number of intra-prediction modes available for blocks of that size are shown in tables 1 and 2 above. Intra-prediction unit 46 may intra-prediction encode the current video block (assuming a left-to-right, top-to-bottom encoding order of blocks) relative to neighboring previously coded blocks (e.g., blocks above, above-right, above-left, or left of the current block). Intra-prediction unit 46 may utilize neighboring blocks based on the selected intra-prediction mode for the current block.
Intra-prediction unit 46 may determine the appropriate intra-prediction mode for the block by selecting one of the available intra-prediction modes. Intra-prediction unit 46 may select the intra-prediction mode by, for example, calculating error values for various intra-prediction modes associated with the block size and selecting the mode that yields the lowest error value. The directional prediction mode may include functionality for combining values of spatially neighboring pixels and applying the combined values to one or more pixel locations in a Prediction Unit (PU). Once the values for all pixel locations in the PU have been calculated, intra-prediction unit 46 may calculate an error value for the prediction mode based on the pixel differences between the PU and the current block to be encoded. Intra-prediction unit 46 may continue to test the intra-prediction mode associated with the block size of the current block until an intra-prediction mode is found that yields an acceptable error value. The intra prediction mode that yields the lowest error value may then be used to generate the PU for the current block. Intra-prediction unit 46 may signal the selected intra-prediction mode such that it is encoded with syntax elements associated with the coded block.
According to this disclosure, video encoder 20 may determine whether to apply a smoothing filter to the pixels surrounding the prediction samples for a current block encoded using intra-prediction coding. The determination may be based on the block size and the intra-prediction mode used in coding the current block. In one example in accordance with the techniques of this disclosure, no smoothing filter is applied when using mode 0, 1, or 2, which correspond to the vertical, horizontal, and DC modes, respectively, as each of these modes provides a relatively accurate prediction result without smoothing. Therefore, no smoothing filter is applied when the prediction direction is purely horizontal or vertical, or when the intra-prediction mode is the DC mode. Whether to apply the smoothing filter for the other modes (i.e., the diagonal modes) depends on the prediction unit size and the intra-prediction mode. Table 3 below shows example relationships for combinations of block sizes and intra-prediction modes to which smoothing filters are applied. The combinations of block sizes and intra-prediction modes to which filtering may be applied may be based on offline training and statistics. Table 3 does not represent the only type of relationship that may be defined, but merely provides an example consistent with this disclosure.
Block size Modes to be filtered Shown in fig. 3
2x2 No filtering for all modes n/a
4x4 Modes 3, 6, and 9 Modes between circle 0 and circle 1
8x8 Modes 3, 4, 5, 6, 7, 8, and 9 Modes between circle 0 and circle 2
16x16 Modes 3 to 17 Modes between circle 0 and circle 3
32x32 Modes 3 to 33 Modes between circle 0 and circle 4
64x64 Modes 3 to 4 Modes between circle 0 and circle 4
128x128 Modes 3 to 4 Modes between circle 0 and circle 4
TABLE 3
As illustrated in table 3, determining whether to apply a smoothing filter is general and remains the same based on the block size and the intra-prediction mode used to encode the block. In one example, the combination of size and intra-prediction mode shown above in table 3 and for which the smoothing filter is applied may be predefined in a look-up table (LUT). The LUT may be available at both the encoder or the decoder, or the encoder may send it to the decoder. Thus, determining whether to apply a smoothing filter may be generic and remain the same regardless of encoder and/or decoder variations, and signaling of the filtering determination may not be required. In addition, making a determination whether to apply a smoothing filter does not introduce significant encoder complexity.
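One possible transcription of table 3 into such a LUT is sketched below. The dictionary and function names are illustrative, not part of any standard; only the filtering decisions themselves come from table 3:

```python
# Filtering decision LUT transcribing Table 3: block size -> set of
# intra-prediction modes for which the smoothing filter is applied.
FILTERED_MODES = {
    2: set(),                  # no filtering for any mode
    4: {3, 6, 9},
    8: {3, 4, 5, 6, 7, 8, 9},
    16: set(range(3, 18)),     # modes 3 to 17
    32: set(range(3, 34)),     # modes 3 to 33
    64: {3, 4},
    128: {3, 4},
}

def apply_smoothing(block_size, mode):
    """Return True when Table 3 prescribes intra smoothing for this
    combination; modes 0, 1, and 2 are never filtered."""
    return mode in FILTERED_MODES.get(block_size, set())

print(apply_smoothing(8, 6))    # True
print(apply_smoothing(8, 0))    # False: vertical mode is unfiltered
print(apply_smoothing(16, 17))  # True
```

Because the table is fixed, both encoder and decoder can evaluate the same function from the block size and mode already present in the bitstream, which is why no extra signaling is needed in this mode.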
In one example, during intra-prediction encoding of a block, an encoder may determine whether to apply a smoothing filter to the prediction samples based on a combination of the size of the block and the intra-prediction mode used and according to the combination defined in the LUT. These combinations may be associated with the most probable filtering options as indicated in the most probable filter table. For example, a most probable filter table may be created to indicate which filtering option to utilize for each combination of block size and intra-prediction mode. In one example, a smoothing filter may be selected from a candidate list of filters for each combination of block size and intra-prediction mode. The no filtering option may be generalized to a smoothing filter that is an all-pass filter. In some examples, one or more smoothing filters may be used, where, for example, one smoothing filter may work better for some combinations of block sizes and intra-prediction modes, and a second smoothing filter may work better for other combinations of block sizes and intra-prediction modes. The number of filters in the candidate set of smoothing filters may vary between systems. In one example, filter options for a combination of block size and intra-prediction mode may be based on offline training with different training video data. For simplicity, the number of filters may be 3, filter 0, filter 1, and filter 2, where filter 0 indicates all-pass filtering (i.e., no filtering). Using the candidate filters, a most probable filter table, represented in pseudo code, may be generated according to the following table:
In the above example, the most probable smoothing filter may be defined for each block size and intra-prediction mode. The table index for a particular combination may be defined as [S, M], corresponding to size and mode. For example, the block size index may range from 1 to 7, corresponding to block sizes 2 × 2, 4 × 4, 8 × 8, 16 × 16, 32 × 32, 64 × 64, and 128 × 128. The intra-prediction mode index may range from 1 to 40. However, in the example of this disclosure, only 35 modes are defined, and thus the remaining entries are filled with 0, indicating filter 0, the all-pass filter. The example table above shows the smoothing filter from the example set of filters (0, 1, 2) for each entry in the table. The intra-prediction modes to which smoothing filter 1 or 2 is applied correspond to the combinations defined in table 3 above.
In one example, during encoding, video encoder 20 may implement fixed mode-dependent intra-prediction coding, where video encoder 20 selects the most probable smoothing filter for each combination of block size and intra-prediction mode as defined by the above table. In this example, video encoder 20 may indicate that fixed mode-dependent intra-prediction coding is relied upon, without having to encode which smoothing filter is used for the block. During decoding, if fixed mode-dependent intra-prediction coding is indicated, a decoder (e.g., video decoder 30) may determine which smoothing filter to apply using information from the above table, with the block size and intra-prediction mode information used as the index into the table. For example, an 8 × 8 block encoded using intra-prediction mode 6 may correspond to the entry Filter1[3, 6], which is smoothing filter 2, where the name Filter1 indicates that the information is obtained from the table corresponding to the most probable filter table.
In some cases, using fixed mode dependent intra-prediction coding may not yield the best results for some of the larger block sizes (e.g., 64 × 64 and 128 × 128). In one example, during intra-prediction encoding of a block, video encoder 20 may use adaptive mode-dependent intra-prediction coding to determine whether to apply a smoothing filter to the prediction samples. For blocks smaller than a threshold size (e.g., 64 x 64), video encoder 20 may utilize the most probable filter table shown above. For blocks greater than or equal to the threshold, such as 64 x 64 blocks and larger blocks, video encoder 20 may evaluate the results of applying different smoothing filters, including no filter, to the prediction samples and determine which one yields better results. The result may be determined based on a quality metric or a rate-distortion metric. The rate portion of the metric may indicate a coding rate of the current block, and the distortion portion of the metric may indicate an amount of distortion in the current block. The rate-distortion metric may be based on rate, distortion, or a combination of both.
In one example, the evaluation of applying different smoothing filters may be based on the coding rates associated with applying each of the filters to the current block. The coding rate may be proportional to an amount of residual data corresponding to the current block when the smoothing filter is used. The larger the amount of residual data, the higher the coding rate for the current block. A smoothing filter that produces a smaller amount of residual data, and therefore a better coding rate, may be considered as a filter that yields better results.
In another example, the evaluation of applying different smoothing filters may be based on an amount of distortion associated with applying each of the filters to the current block. In one example, the amount of distortion may be determined based on a mean square error value between the prediction block and the original block. The prediction block may be a result of using prediction samples obtained by applying the selected smoothing filter. A smoothing filter that produces a smaller amount of mean squared error values, and therefore less distortion, may be considered a filter that yields better results.
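The adaptive, distortion-driven filter selection described above can be sketched as follows. Here `predict` stands in for the intra-prediction step, the candidate labels and the 64-pixel threshold are illustrative assumptions of this sketch, and distortion is measured with the mean squared error mentioned above:

```python
def mse(pred, orig):
    """Mean squared error between a prediction block and the original."""
    n = len(pred) * len(pred[0])
    return sum((p - o) ** 2
               for pred_row, orig_row in zip(pred, orig)
               for p, o in zip(pred_row, orig_row)) / n

def choose_filter(candidates, orig, predict, threshold=64, size=64):
    """For blocks at or above the size threshold, predict the block
    with every candidate smoothing filter (filter 0 = all-pass, i.e.,
    no filtering) and keep the filter with the lowest distortion.
    Smaller blocks fall back to the most-probable-filter table,
    represented here by returning the default label 0."""
    if size < threshold:
        return 0
    return min(candidates, key=lambda f: mse(predict(f), orig))

orig = [[10, 12], [11, 13]]
preds = {
    0: [[10, 20], [11, 21]],  # prediction without smoothing
    1: [[10, 13], [11, 14]],  # prediction after smoothing filter 1
}
best = choose_filter(preds.keys(), orig, lambda f: preds[f])
print(best)  # 1: filter 1 yields the lower mean squared error
```

A production encoder would fold the coding rate into the metric as well (rate-distortion optimization); the sketch isolates the distortion comparison for clarity.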
In one example, video encoder 20 may compare the results of applying the filters specified by the most probable filter table shown above to the filters specified by a second most probable filter table (shown below), which is represented in pseudo code:
in some examples, more tables may be used to specify other filters. In a simple example using a most probable filter table and a second most probable filter table, the entries in the first table corresponding to non-zero filters (e.g., filters 1 and 2) may be replaced with filter 0 in the second table, and vice versa. In this way, the comparison between the most probable filter and the second most probable filter is between filtering and non-filtering. If the filtering option that yields better results is an option specified in the most probable filter table, video encoder 20 may not need to indicate anything because if not otherwise specified, the decoder assumes that the most probable filter is used. If the filtering option that yields better results is an option specified in the second most probable filter table, video encoder 20 may signal a filtering mode to indicate that a different filter than the most probable filter should be applied during decoding. The filter table may be stored in a memory unit (not shown) associated with video encoder 20 and accessed by intra-prediction unit 46 during coding.
The signaling of the filtering mode can be done in several ways. In one example, video encoder 20 may use a filter mode flag to signal whether a different filter than specified in the most probable filter table is used for a block. In examples that use more than two filter tables, such as where there is a third most probable filter table (or more), video encoder 20 may indicate which of the second or third most probable filter tables is used for a block. In yet another example, the signaling may be based on a combination of context information, such as a block size of the current block, a prediction mode, and a filtering mode of a neighboring block. The context information may be used to create a VLC table or CABAC table that entropy codes a signal regarding whether filtering was used for the current block. In this way, the decoder may determine whether filtering should be used for the block based on the context information and use the VLC table or CABAC information to determine the filtering mode.
In other examples, other techniques may be utilized to determine whether to apply the smoothing filter. The smoothing filter techniques of this disclosure may provide an intra smoothing filter mode decision (whether or not to filter) based on statistics of the bitrate distortion metric as a function of the combination of intra prediction mode and block size.
Referring again to fig. 2B, fig. 2B illustrates combinations of intra-prediction modes and PU sizes that may be candidates for intra smoothing filtering based on a comparison between statistics of filtered and unfiltered intra smoothing modes. For example, the statistics indicate that the vertical (mode 0) and horizontal (mode 1) prediction directions are candidates for intra smoothing filtering for PU size 32 × 32, and that the primary diagonal intra-prediction directions (modes 3, 6, and 9) are candidates for intra smoothing filtering for PU sizes 32 × 32, 16 × 16, 8 × 8, and 4 × 4. The statistics may also indicate that intra-prediction directions between the horizontal or vertical directions and the primary diagonal directions are candidates for intra smoothing filtering at progressively smaller PU sizes. For DC mode, the statistics may indicate that mode 2 (DC mode) is not a candidate for intra smoothing filtering, while planar mode (mode 34) is a candidate for intra smoothing filtering at certain PU sizes, such as 8 × 8, 16 × 16, and 32 × 32.
Although the statistics may indicate which intra-prediction modes and PU sizes are candidates for intra smoothing filtering, coding efficiency experiments may show that the intra smoothing filtering candidate assignment shown in fig. 2B is not optimal in a rate-distortion sense. Thus, the intra smoothing candidates of fig. 2B may serve as a "template mode" for deriving a "subset mode" with higher coding efficiency. In one example, a "subset mode" is defined by selecting candidates for intra smoothing filtering from a "template mode" according to intra-prediction mode and PU size, or by using no intra smoothing filtering for any PU size corresponding to a certain intra-prediction mode. Tables 4-7 below provide some examples of subset modes. Tables 4-7 may be generally referred to as filter decision tables.
In each of tables 4-7, a "1" may indicate that intra smoothing filtering is applied for the corresponding intra-prediction mode and block size. Thus, a "1" in a particular column and row position indicates that intra smoothing is applied for the intra-prediction mode and block size mapped to that column and row position, respectively. In one example, table 4 may correspond to a "template mode," and each of tables 5-7 may correspond to a different "subset mode." In another example, table 7 may correspond to a "template mode," and each of tables 4-6 may correspond to a different "subset mode." The "template mode" may correspond to fig. 2B and is illustrated by Table 4; for generality, Table 4 also includes the 64 × 64 PU size. Tables 5 to 7 may represent "subset modes." In one example of this disclosure, table 7 may represent a "subset mode" that is suitable for general video sequences based on coding efficiency experiments.
TABLE 4
TABLE 5
TABLE 6
TABLE 7
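Because the contents of tables 4-7 are not reproduced in this text, the following sketch uses a hypothetical decision table of the same shape to illustrate how a "1" entry maps a (PU size, intra-prediction mode) combination to a filtering decision. The specific entries are assumptions for illustration, not the actual tables of this disclosure.

```python
# Hypothetical filter decision table in the spirit of Tables 4-7: each intra
# prediction mode maps to per-PU-size decisions, where 1 means intra
# smoothing is applied for that (size, mode) combination. Entries are
# illustrative only.

PU_SIZES = [4, 8, 16, 32, 64]
# mode index -> decisions ordered from smallest to largest PU size
DECISION = {
    0: [0, 0, 0, 1, 1],  # vertical: filter only at 32x32 and 64x64
    3: [1, 1, 1, 1, 1],  # main diagonal: filter at all sizes
    2: [0, 0, 0, 0, 0],  # DC: never filter in this example
}

def apply_smoothing(pu_size, mode):
    """Return True if the table marks this (size, mode) combination with 1."""
    return bool(DECISION[mode][PU_SIZES.index(pu_size)])

assert apply_smoothing(32, 0) is True
assert apply_smoothing(8, 0) is False
assert apply_smoothing(4, 3) is True
```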
After prediction unit 41 generates a prediction block or PU for the current video block (inter or intra coded), video encoder 20 forms a residual video block by subtracting the prediction block from the current video block. Summer 50 represents the component that performs this subtraction operation. Transform unit 52 may form one or more Transform Units (TUs) from the residual blocks. Transform unit 52 applies a transform, such as a Discrete Cosine Transform (DCT) or a conceptually similar transform, to the TU, producing a video block that includes residual transform coefficients. The transform may convert the residual block from a pixel domain to a transform domain, such as the frequency domain.
Transform unit 52 may perform other transforms, such as those defined by HM, which are conceptually similar to DCT. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms may also be used. In any case, transform unit 52 applies the transform to the residual block, producing a block of residual transform coefficients. The transform may convert the residual information from a pixel value domain to a transform domain (e.g., frequency domain).
Transform unit 52 may send the resulting transform coefficients to quantization unit 54. Quantization unit 54 quantizes the transform coefficients to further reduce the bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The quantization level may be modified by adjusting a quantization parameter. In some examples, quantization unit 54 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, entropy encoding unit 56 may perform scanning.
After quantization, entropy coding unit 56 entropy codes the quantized transform coefficients. For example, entropy coding unit 56 may perform Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), or another entropy coding technique. After entropy coding by entropy coding unit 56, the encoded video may be transmitted to another device (e.g., video decoder 30), or archived for later transmission or retrieval.
In some cases, entropy coding unit 56 or another unit of video encoder 20 may be configured to perform other coding functions in addition to entropy coding. For example, entropy coding unit 56 may be configured to determine CBP values for macroblocks and partitions. Also, in some cases, entropy coding unit 56 may perform run-length coding of coefficients in its macroblocks or partitions. In particular, entropy coding unit 56 may apply zig-zag scanning or other scanning modes to scan the transform coefficients in a macroblock or partition, and encode runs of zeros for further compression. Entropy coding unit 56 may also construct header information that includes appropriate syntax elements for transmission in the encoded video bitstream. To entropy encode the syntax element, entropy coding unit 56 may perform CABAC and binarize the syntax element into one or more bins based on the context model. The entropy encoding unit may also perform CAVLC and encode the syntax elements into codewords according to the context-based probability.
In accordance with the techniques of this disclosure, entropy coding unit 56 may be configured to encode an indication that a smoothing filter different from the most probable filter is applied to the prediction samples during intra-prediction coding. When the filtering decision for the block is the same as the filtering decision specified in the most probable filter table, video encoder 20 may not send an indication to the decoder, and thus may not need to encode additional information. When an indication is needed, such as when a filtering decision for a block is specified in the second (or more) most likely filter table, video encoder 20 may encode an indication regarding filtering. In addition, video encoder 20 may encode the intra-prediction mode used to encode the block. Entropy coding unit 56 may encode the indication (e.g., intra-prediction mode and filtering decisions, when applicable) using CABAC. Furthermore, entropy coding unit 56 may use the context information for performing CABAC to encode an indication that indicates an intra-prediction mode for the block when the second (or more) most probable filter is applied to the prediction samples, as well as filtering information.
Inverse quantization unit 58 and inverse transform unit 60 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain for later use as a reference block for the reference frame. Motion compensation unit 44 may calculate the reference block by adding the residual block to a prediction block of one of the frames of reference frame store 64. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Summer 62 adds the reconstructed residual block to the motion compensated prediction block produced by motion compensation unit 44 to produce a reconstructed video block for storage in reference frame store 64. The reconstructed video block may be used by motion estimation unit 42 and motion compensation unit 44 as a reference block to inter-predict a block in a subsequent video frame.
Fig. 5 is a block diagram illustrating an example of video decoder 30 that decodes an encoded video sequence. In the example of fig. 5, video decoder 30 includes an entropy decoding unit 70, a prediction unit 71, an inverse quantization unit 76, an inverse transform unit 78, a reference frame store 82, and a summer 80. Prediction unit 71 includes motion compensation unit 72 and intra prediction unit 74. Video decoder 30 may, in some examples, perform a decoding pass that is generally reciprocal to the encoding pass described with respect to video encoder 20 (fig. 4). Prediction unit 71 may generate prediction data based on information received from entropy decoding unit 70. When inter-prediction coding is indicated, motion compensation unit 72 may generate prediction data based on the motion vectors received from entropy decoding unit 70. When intra-prediction coding is indicated, intra-prediction unit 74 may generate prediction data for a current block of a current frame based on the indicated intra-prediction mode and data from previously decoded blocks of the current frame.
Video decoder 30 may receive the encoded video bitstream, for example, over a network, broadcast, or from a physical medium. The encoded bitstream may include entropy coded video data. Entropy decoding unit 70 of video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors, and prediction data. Entropy decoding unit 70 forwards the motion vectors and prediction data to prediction unit 71.
In accordance with the techniques of this disclosure, entropy coded video data may include an entropy coded indication of an intra-prediction mode used when coding the current block. The entropy coded video data may also include an indication of filtered entropy coded, indicating a smoothing filter decision. In one example, a filtering indication may be used to indicate that the most probable filter table was not used to select a smoothing filter during intra-prediction coding of the current block. When the entropy coded video data does not include a filtering indication, no indication implies that the most probable filter table was used to select a smoothing filter during intra-prediction coding of the current block. The entropy coded video data may also include block size information that entropy decoding unit 70 may use as context information along with the intra-prediction mode when entropy decoding the filtering indication for the intra-prediction coded block. In this way, entropy decoding unit 70 may decode the filtering indication (e.g., whether to apply a smoothing filter) and send the indication to intra-prediction unit 74.
Motion compensation unit 72 may use the motion vectors received in the bitstream to identify a prediction block in a reference frame in reference frame store 82. Intra-prediction unit 74 may use an intra-prediction mode received in the bitstream to form a prediction block from spatially neighboring blocks. Inverse quantization unit 76 inverse quantizes (i.e., dequantizes) the quantized block coefficients provided in the bitstream and decoded by entropy decoding unit 70. The inverse quantization process may comprise, for example, a conventional process as defined by the h.264 decoding standard. The inverse quantization process may also include, for each video block or CU, using a quantization parameter QPY calculated by video encoder 20 to determine a degree of quantization and likewise a degree to which inverse quantization should be applied. Inverse transform unit 78 applies an inverse transform (e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process) to the transform coefficients in order to generate residual blocks in the pixel domain.
Motion compensation unit 72 may also perform interpolation based on the interpolation filters. Motion compensation unit 72 may calculate interpolated values for sub-integer pixels of the reference block using interpolation filters as used by video encoder 20 during encoding of the video block. Motion compensation unit 72 may determine, from the received syntax elements, the interpolation filters used by video encoder 20 and use the interpolation filters to generate the prediction blocks.
Motion compensation unit 72 uses some of the syntax information to determine the size of LCUs and CUs used to encode frames and/or slices of the encoded video sequence, partition information that describes how each macroblock of a frame of the encoded video sequence is partitioned, a mode that indicates how each partition is encoded, one or more reference frames (and reference frame lists) for each inter-coded CU, and other information used to decode the encoded video sequence.
Intra-prediction unit 74 may also determine whether to apply a smoothing filter to the prediction samples when forming the prediction block. Intra-prediction unit 74 may make a determination as to whether to apply a smoothing filter based on the block size and the corresponding intra-prediction mode. If no filtering indication is signaled, intra-prediction unit 74 may determine filtering based on what is specified in the most probable filter table (shown above) for the block size and intra-prediction mode combination. If a filtering indication is signaled, intra-prediction unit 74 may determine that the filtering decision is not based on the most probable filter table and may determine filtering based on what is specified in a second most probable filter table (shown above) for the block size and intra-prediction mode combination. In one example, more than one additional filter table may be provided, such as a second, third, or more most probable filter tables. In this example, the filtering indication may also indicate which of the tables should be used to determine which smoothing filter to apply. The filter table may be stored in a memory unit (not shown) in video decoder 30. Intra-prediction unit 74 may access the filter table from memory during decoding.
Summer 80 sums the residual block with the corresponding prediction block generated by motion compensation unit 72 or intra-prediction unit 74 to form a decoded block. A deblocking filter may also be applied to filter the decoded blocks, if desired, in order to remove blocking artifacts. The decoded video blocks are then stored in reference frame store 82, reference frame store 82 providing reference blocks for subsequent motion compensation and also generating decoded video for presentation on a display device (e.g., display device 32 of fig. 1).
Fig. 6 is a flow diagram illustrating an example method for encoding video data in accordance with the techniques of this disclosure. Although described with respect to video encoder 20 (fig. 1 and 4) for purposes of example, it should be understood that the techniques of fig. 6 may be performed by any video coding unit. Moreover, the steps illustrated in fig. 6 may be performed in a different order, certain steps may be omitted, and additional steps may be added, without departing from the techniques of this disclosure.
Video encoder 20 may receive a block (e.g., a CU) of video data to be encoded. When the current block is designated as a block to be encoded using intra-prediction coding, video encoder 20 may determine whether to apply a smoothing filter to the prediction block, and if a smoothing filter is to be applied, determine the smoothing filter. In some examples, a smoothing filter may be applied to the prediction samples or predictors used to predict blocks in the same frame in intra-prediction coding. In some cases, applying a smoothing filter provides more accurate prediction and minimizes blocking in the reconstructed predicted frame.
Video encoder 20 may determine to operate with either fixed mode-dependent intra-prediction coding or adaptive mode-dependent intra-prediction coding (602). In one example, the determination may be made at the sequence or picture level based on content or based on block size. For example, for large blocks, intra-prediction coding that depends on an adaptive mode may be used, and for small blocks, intra-prediction coding that depends on a fixed mode may be used. Thus, when larger blocks are used, there are fewer blocks in the picture to encode, and thus a smaller amount of overhead information than when smaller blocks are used. However, fixed mode dependent intra-prediction coding adds little or no overhead information, while adaptive mode dependent intra-prediction coding requires overhead information. In this way, the selected mode may provide a tradeoff between overhead information and performance.
If video encoder 20 is to operate with fixed-mode dependent intra-prediction coding, video encoder 20 may determine whether to apply a smoothing filter based on a specified filter in a filter table (e.g., a most probable filter table) that is selected based on a block size and intra-prediction mode combination, and apply the specified filter (604).
In one example, for each of the available intra-prediction modes (e.g., the 35 modes defined by the HM), whether to apply a smoothing filter depends on the block size. Table 3 above shows the block size and intra prediction mode combinations to which the smoothing filter is applied. A most probable filter table (shown above) may be generated based on the block size and intra-prediction mode combination. Each of the entries of the table may indicate a filter that should be applied to a corresponding index, where the index is associated with a block size and an intra-prediction mode. Each of the filters specified in the table may be from a set of filters, where the set of filters may vary between streams, between groups of frames, between frames or slices, and so on. For simplicity, an all-pass filter may be identified for entries that do not require the application of a smoothing filter.
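The lookup described above, with an all-pass entry standing in for "no smoothing," can be sketched as follows. The [1, 2, 1]/4 smoothing kernel is a common choice for intra reference-sample smoothing, but both it and the table contents are assumptions for this sketch, not the filters or tables defined by this disclosure.

```python
# Sketch of applying a filter selected from a most probable filter table.
# Filter index 0 is the all-pass filter (samples pass through unchanged);
# index 1 is a [1, 2, 1]/4 smoothing filter. Table contents are illustrative.

def smooth_121(samples):
    """Apply a [1, 2, 1]/4 filter with rounding; end samples stay unfiltered."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1] + 2) >> 2
    return out

FILTERS = {0: lambda s: list(s), 1: smooth_121}  # 0 = all-pass

def filter_reference_samples(samples, block_size, mode, table):
    """Look up the filter index for (block_size, mode) and apply that filter."""
    filt = table.get((block_size, mode), 0)  # default: all-pass
    return FILTERS[filt](samples)

table = {(32, 3): 1}  # hypothetical: smooth for 32x32, diagonal mode
ref = [10, 20, 10, 20, 10]
assert filter_reference_samples(ref, 32, 3, table) == [10, 15, 15, 15, 10]
assert filter_reference_samples(ref, 8, 0, table) == ref
```

Identifying an all-pass filter for the no-smoothing entries, as in the disclosure, lets the same lookup-and-apply path handle every (block size, mode) combination uniformly.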
If video encoder 20 is to operate in an adaptive mode-dependent intra-prediction coding operation, video encoder 20 may determine whether the block size is equal to or above a threshold block size, e.g., 64 × 64 (606). If the block size is less than the threshold block size, video encoder 20 may apply the filter specified by the most probable filter table and determine whether to apply a smoothing filter based on the block size and the intra-prediction mode (604). If the block size is equal to or greater than the threshold block size, video encoder 20 may determine whether applying a filter according to the most probable filter table yields better results than applying a filter according to the second most probable filter table (608). Thus, video encoder 20 may apply the corresponding filter from the first table to the block, apply the corresponding filter from the second table to the block, and evaluate the results of applying the two filters by comparing the results to each other to determine which yields a better result.
In one example, for each block size and intra-prediction mode combination, applying a filter according to one of the tables may be the same as applying no filter at all, or in this example, the same as applying an all-pass filter. Thus, in practice, a comparison of applying a filter according to the most probable filter table with applying a filter according to the second most probable filter table may be the same as a result of comparing applied or not applied filters.
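A minimal sketch of this encoder-side comparison follows. A real encoder would compare full rate-distortion costs; the sum of squared differences (SSD) used here is a stand-in, and all sample values are illustrative assumptions.

```python
# Minimal sketch of comparing the two filtering options (filter vs. no
# filter) and keeping the one that predicts the source better. SSD is a
# proxy for a full rate-distortion cost; values are illustrative.

def smooth_121(samples):
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1] + 2) >> 2
    return out

def ssd(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def choose_option(source, reference):
    cost_filtered = ssd(source, smooth_121(reference))
    cost_unfiltered = ssd(source, reference)
    # If the most probable option wins, nothing need be signaled; otherwise
    # the encoder signals that the other option is used.
    return 'filtered' if cost_filtered <= cost_unfiltered else 'unfiltered'

reference = [10, 20, 10, 20, 10]
smooth_source = [10, 14, 15, 14, 10]  # smooth content favors filtering
assert choose_option(smooth_source, reference) == 'filtered'
assert choose_option(reference, reference) == 'unfiltered'
```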
If applying a filter according to the most probable filter table yields better results, video encoder 20 may determine and apply the filter specified by the most probable filter table based on the block size associated with the current block in combination with the intra-prediction mode (604). If applying a filter according to the second most probable filter table yields better results, video encoder 20 may determine and apply the filter specified by the second most probable filter table based on the block size associated with the current block in combination with the intra-prediction mode (610). In some examples, when there are more than two most probable filter tables (e.g., a third or more), the results of applying all possible filters may be compared to determine which filter table will be used to select a smoothing filter to apply to the prediction samples. In some examples, multiple filtering decision tables may be used, such as tables 4-7 above, where one table may be considered a "template mode" and the remaining tables may be considered alternative tables or "subset modes" that provide alternative filtering options, as described above.
Video encoder 20 may then encode the current block by applying the selected smoothing filter to the prediction samples and performing intra-prediction coding according to the intra-prediction mode associated with the current block (612). Video encoder 20 may also encode syntax elements associated with intra-prediction coding with the current block (614), such as one or more syntax elements that identify an intra-prediction mode. Other examples of syntax elements associated with intra-prediction coding that may be encoded with a block are partition types associated with the block, PU size, TU size, CBP, and quantized coefficients. If the applied smoothing filter decision is not based on the most probable filter table or "template mode," e.g., a second most probable filter table or other "subset mode" was used to make the filtering decision, video encoder 20 may also encode a syntax element indicating that the most probable filter table was not used, thereby signaling that the block was not encoded using fixed mode-dependent intra-prediction coding. For example, if table 7 above is the "template mode" and tables 4-6 are "subset modes," then when a "subset mode" table is used instead of the "template mode" table, the video encoder may encode a syntax element indicating which "subset mode" table was used. Video encoder 20 may then transmit the encoded video blocks within the encoded video frame and having the syntax elements to a video decoder (616).
Fig. 7 is a flow diagram illustrating an example method for decoding video data in accordance with the techniques of this disclosure. Although described with respect to video decoder 30 (fig. 1 and 5) for purposes of example, it should be understood that the techniques of fig. 7 may be performed by any video coding unit. Moreover, the steps illustrated in fig. 7 may be performed in a different order, certain steps may be omitted, and additional steps may be added, without departing from the techniques of this disclosure.
Video decoder 30 may receive an encoded block (e.g., CU) of video data to be decoded (702). Video decoder 30 may decode the encoded video block and the syntax element to determine that the block was encoded using intra-prediction coding (704). Based on the syntax element, video decoder 30 may also determine an intra-prediction mode for the block (706). Video decoder 30 may determine whether the syntax element includes a filtering indication (708).
If no filtering indication is present, video decoder 30 may then decode the block using intra-prediction coding and apply a smoothing filter option as specified by the most probable filter table (710). If there is a filtering indication, video decoder 30 may then decode the block using intra-prediction coding and apply a smoothing filter option as specified by the second most probable filter table (712). In either case, whether the most probable filter table or the second most probable filter table is used, video decoder 30 may determine the filter based on the block size associated with the current block in combination with the intra-prediction mode. In some examples, multiple tables may define the second (or more) possible filters, or there may be multiple "subset mode" filters, and the filtering indication may indicate a "subset mode" table that is used instead for the most likely filter table or the "template mode" table.
In this way, video decoder 30 may not need to make any determination as to which filtering option provides better results. If the encoder does not signal any indication, video decoder 30 assumes that the most probable filter table or "template mode" table was used; otherwise, the signaled indication indicates that the second most probable filter table was used. In some examples using more than two filter tables (e.g., a third (or more) most probable filter table or "subset mode"), the filtering indication may identify which of the additional tables was used in selecting the smoothing filter.
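The decoder-side selection described above can be sketched as follows. The table names, indices, and contents are hypothetical; the point is only that the absence of an indication selects the most probable table, while a signaled indication selects one of the alternatives.

```python
# Decoder-side sketch: absent a filtering indication, the most probable
# filter table is assumed; when an indication is present it identifies which
# alternative ("subset mode") table to use. All contents are hypothetical.

TABLES = {
    'most_probable': {(32, 0): 1, (8, 0): 0},
    'subset_a':      {(32, 0): 0, (8, 0): 0},
    'subset_b':      {(32, 0): 0, (8, 0): 1},
}

def select_filter(block_size, mode, filtering_indication=None):
    """Return the filter index for the block, defaulting to the most
    probable filter table when no indication was signaled."""
    table_name = filtering_indication or 'most_probable'
    return TABLES[table_name].get((block_size, mode), 0)

assert select_filter(32, 0) == 1              # no indication signaled
assert select_filter(32, 0, 'subset_a') == 0  # indicated alternative table
assert select_filter(8, 0, 'subset_b') == 1
```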
As discussed above, for example, intra smoothing filtering may be used for larger PU sizes, as illustrated by fig. 2B and table 4. Intra smoothing filtering may also be used for most PU sizes for the intra-prediction modes corresponding to true diagonal directions (i.e., +/-45 and +/-135 degrees), that is, modes 3, 6, and 9, while intra smoothing filtering may not be used for smaller PU sizes in the case of the vertical (mode 0) and horizontal (mode 1) directions. For intra-prediction directions between the diagonal direction and the horizontal or vertical direction, respectively, intra smoothing filtering may be less likely to be used for smaller PU sizes when the direction is near the horizontal or vertical direction, and more likely to be used for smaller PU sizes when the direction is near the diagonal direction, as illustrated in fig. 2B and table 4. In one example, an encoder and decoder in accordance with the techniques of this disclosure may apply intra smoothing filtering in accordance with the "subset mode" specified by table 7 shown above. In one example, for DC mode (mode 2), intra smoothing filtering may not be used for any PU size. Also, for planar mode (mode 34), intra smoothing filtering may be utilized in conjunction with larger PU sizes.
The encoder and decoder may apply intra smoothing filtering according to a "subset mode" derived from the "template mode" (table 4), which specifies, for each combination of PU size and intra-prediction mode, whether to perform intra smoothing filtering. A "subset mode" is derived by selecting candidates for intra smoothing filtering from the "template mode" according to intra-prediction mode and PU size, or by using no intra smoothing filtering for any PU size corresponding to a given intra-prediction mode. A "subset mode" derived from the "template mode" may have certain characteristics. For example, for each intra-prediction mode, intra smoothing filtering is not applied for at least one of the PU sizes, so that the encoder has the option to perform intra prediction without intra smoothing. The "subset mode" may also have the characteristic that, for each intra-prediction mode, if intra smoothing filtering is not applied at a certain PU size, then intra smoothing filtering is not applied at any smaller PU size for that intra-prediction mode.
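The two "subset mode" characteristics just described can be checked mechanically, as the following sketch illustrates; the example decision lists are hypothetical and ordered from smallest to largest PU size.

```python
# Sketch of checking the two "subset mode" properties described above:
# (1) every intra mode has at least one PU size without smoothing, and
# (2) once smoothing is off at some PU size, it stays off for all smaller
# sizes of that mode. With decisions ordered smallest to largest PU size,
# property (2) means each decision list must be non-decreasing.

def is_valid_subset_mode(decisions_by_mode):
    for decisions in decisions_by_mode.values():
        if all(decisions):  # property (1) violated: never unfiltered
            return False
        # property (2) violated if a 1 is followed by a 0 (decreasing step)
        if any(a > b for a, b in zip(decisions, decisions[1:])):
            return False
    return True

assert is_valid_subset_mode({0: [0, 0, 0, 1, 1], 3: [0, 1, 1, 1, 1]})
assert not is_valid_subset_mode({0: [1, 1, 1, 1, 1]})  # never off
assert not is_valid_subset_mode({0: [0, 1, 0, 1, 1]})  # off below on
```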
The calculation of prediction samples for each intra prediction mode and PU size combination may be subject to rounding rules, such as rounding down, rounding up, or rounding to the nearest integer. The rounding rules may be modified in the encoder and decoder in a deterministic manner based on the prediction sample position, e.g., based on the intra-prediction mode and PU size, e.g., interleaving the prediction samples with different rounding rules, in order to increase coding efficiency. In an example of intra smoothing filtering of reference samples for intra prediction, the intra smoothing filtering may be skipped for certain reference sample positions (e.g., upper left corner reference samples or other reference sample positions) in a deterministic manner (e.g., based on intra prediction mode and PU size).
In one example, an encoder and/or decoder may store a "subset mode" as a lookup table (LUT), and access this LUT to determine whether to apply intra smoothing filtering on PU sizes associated with a current block in combination with intra prediction modes. In another example, an encoder and/or decoder may represent a "subset mode" by a set of conditions for determining whether to apply intra smoothing filtering to: PU size and intra prediction mode. In another example, the encoder and/or decoder may represent a "subset mode" by calculating one or more mathematical equations of whether to apply intra smoothing filtering to: PU size and intra prediction mode. In another example, the encoder and/or decoder may represent a "subset mode" by a combination of LUTs, sets of conditions, and/or sets of mathematical equations.
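The equivalence of the LUT and condition-set representations mentioned above can be sketched as follows. The decision rule itself (smooth diagonal modes at every size, vertical only at 32 or larger) is a hypothetical example, not the rule defined by this disclosure.

```python
# Sketch of two equivalent representations of the same hypothetical "subset
# mode": a lookup table (LUT) and a set of conditions.

LUT = {(s, m): int(m == 3 or (m == 0 and s >= 32))
       for s in (4, 8, 16, 32, 64) for m in (0, 2, 3)}

def smooth_by_condition(size, mode):
    """Condition-set form of the same rule encoded in the LUT above."""
    if mode == 3:                  # main diagonal: always smooth
        return 1
    if mode == 0 and size >= 32:   # vertical: smooth only for large PUs
        return 1
    return 0                       # DC and everything else: no smoothing

# Both representations agree on every (size, mode) combination.
assert all(LUT[(s, m)] == smooth_by_condition(s, m) for (s, m) in LUT)
```

Whether a LUT, a condition set, an equation, or a combination is preferable is an implementation trade-off between table memory and per-block computation.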
In one example, the "template mode" (e.g., table 4) or "subset modes" (e.g., tables 5-7) may be extended to include a variety of intra smoothing filter types that may be used for each PU size and intra-prediction mode combination. For example, for each combination in which intra smoothing filtering is applied, the number in the table may indicate the type of filter to be used. In one example, two intra smoothing filters may be defined and represented by "1" and "2". A table position corresponding to a certain row/column may then be empty, indicating that no filter is applied, or may contain a "1" or "2" to indicate that intra smoothing filtering is applied and which intra smoothing filter is used, i.e., filter "1" or filter "2".
In some examples, HEVC may support short-distance intra prediction (SDIP) PUs, which may have various rectangular sizes depending on the size of the coding unit (CU). In one example, the techniques of this disclosure may be applicable to both square and rectangular PUs, such as the PUs used in SDIP. Table 8 below illustrates an example of rectangular PU sizes (in pixels) according to the CU size for SDIP.
Size of CU      Possible PU sizes
CU 32×32        8×32, 32×8
CU 16×16        4×16, 16×4, 1×16, 16×1
CU 8×8          2×8, 8×2
TABLE 8
In addition to the techniques discussed above, the techniques of this disclosure may include techniques to improve the application of MDIS to rectangular SDIP partitions. For example, applying intra smoothing filtering may depend on the width and/or height of the rectangular SDIP partition. The size of an SDIP partition may be determined using a size determination technique or rule. For example, such techniques for determining the size of an SDIP partition may include determining the minimum or maximum dimension of the partition, calculating a mathematical equation for the size of the partition based on other known variables associated with the partition, or applying a conditional check to the width and height of the SDIP partition to determine the size of the partition. In one example, the resulting determined size of the partition may be used to determine whether to apply intra smoothing filtering, such as by using one of the filtering decision tables described above. For example, the techniques may utilize the size to determine whether to filter. If, in one illustrative example, the minimum size is 8 pixels, the filtering decision may be determined based on the table entries associated with the 32 × 32 blocks and the intra-prediction mode associated with the partition currently being processed.
In one example, applying the intra smoothing filter may be based on an intra prediction direction associated with the SDIP partition, and/or based on a width and/or height of the SDIP partition. For example, when the vertical prediction direction is selected, the height of the partition may be used in one of the above-described filtering decision tables, and in the case of the horizontal direction, the width may be used in one of the above-described filtering decision tables. In another example, horizontal, vertical, and diagonal directions may be considered separately to determine the dimensions to be used.
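The direction-dependent size rule described above can be sketched as follows. The fallback of using the minimum dimension for the remaining directions is an assumption added for illustration; the disclosure leaves this as a design choice.

```python
# Sketch of a direction-dependent size rule for rectangular SDIP partitions:
# use the partition height for vertical prediction, the width for horizontal
# prediction, and (as one hypothetical fallback) the minimum dimension for
# other directions. The resulting size indexes a square filter decision
# table as described above.

def size_for_filter_decision(width, height, mode):
    if mode == 'vertical':
        return height
    if mode == 'horizontal':
        return width
    return min(width, height)  # hypothetical rule for remaining directions

assert size_for_filter_decision(8, 32, 'vertical') == 32
assert size_for_filter_decision(8, 32, 'horizontal') == 8
assert size_for_filter_decision(8, 32, 'diagonal') == 8
```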
In one example, when planar mode is used for SDIP partitions, the size of the SDIP partition may be determined using a size determination technique as described above. The determined size may then be used to determine whether to apply intra smoothing filtering, such as by consulting one of the filtering decision tables described above.
Fig. 8 is a flow diagram illustrating another example method for encoding video data in accordance with the techniques of this disclosure. Although described with respect to video encoder 20 for example purposes, it should be understood that the techniques of fig. 8 may be performed by any video coding unit. Moreover, the steps illustrated in fig. 8 may be performed in a different order, certain steps may be omitted, and additional steps may be added, without departing from the techniques of this disclosure.
Video encoder 20 may receive a block (e.g., a CU) of video data to be encoded. When the current block is designated to be encoded using intra-prediction coding, video encoder 20 may determine the current block size and an intra-prediction mode for encoding the current block (802). As discussed above, the video encoder may then use a filter table (e.g., Table 7 above) to determine and apply a smoothing filter based on the block size and the intra-prediction mode.
Video encoder 20 may then encode the current block by applying the selected smoothing filter to the prediction samples and performing intra-prediction coding according to the intra-prediction mode associated with the current block (804). Video encoder 20 may also encode syntax elements associated with intra-prediction coding for the current block (806), such as one or more syntax elements that identify the intra-prediction mode. Other examples of syntax elements associated with intra-prediction coding that may be encoded with a block include the partition type associated with the block, PU size, TU size, CBP, and quantized coefficients. Video encoder 20 may then transmit the encoded video block, within an encoded video frame and along with the syntax elements, to a video decoder (808).
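The size- and mode-dependent filtering decision described in this disclosure (no filtering for 4 x 4 blocks; filtering only for planar and the diagonal modes at 8 x 8; progressively more modes filtered at 16 x 16 and 32 x 32) can be sketched as follows. The symbolic mode names stand in for actual angular mode indices and are hypothetical; both encoder and decoder would consult the same decision so that no filtering flag needs to be signaled.

```python
# Symbolic stand-ins for angular mode indices (hypothetical names).
DIAGONALS = {"diag_up_left", "diag_up_right", "diag_down_left"}
NEAR_HV = {"near_horizontal", "near_vertical"}  # modes adjacent to H or V

def smoothing_decision(size, mode):
    """Return True if the intra-prediction reference samples should be
    smoothed for this square block size and symbolic mode name."""
    if size == 4:
        # 4x4 blocks: never filter, regardless of mode.
        return False
    if size == 8:
        # 8x8 blocks: filter only for planar and the three diagonal modes.
        return mode == "planar" or mode in DIAGONALS
    if size == 16:
        # 16x16 blocks: filter for every mode except horizontal, vertical,
        # DC, and the modes immediately adjacent to horizontal or vertical.
        return mode not in ({"horizontal", "vertical", "dc"} | NEAR_HV)
    if size == 32:
        # 32x32 blocks: filter for every mode except horizontal, vertical,
        # and DC.
        return mode not in {"horizontal", "vertical", "dc"}
    raise ValueError("unsupported block size: %d" % size)
```

Because the decision depends only on the block size and intra-prediction mode, a decoder can reproduce it exactly from the decoded syntax elements.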
Fig. 9 is a flow diagram illustrating another example method for decoding video data in accordance with the techniques of this disclosure. Although described with respect to video decoder 30 for example purposes, it should be understood that the techniques of fig. 9 may be performed by any video coding unit. Moreover, the steps illustrated in fig. 9 may be performed in a different order, certain steps may be omitted, and additional steps may be added, without departing from the techniques of this disclosure.
Video decoder 30 may receive an encoded block (e.g., CU) of video data to be decoded (902). Video decoder 30 may decode the encoded video block and the syntax element to determine that the block was encoded using intra-prediction coding (904). Based on the syntax element, video decoder 30 may also determine an intra-prediction mode for the block (906). Video decoder 30 may then decode the block using intra-prediction coding and apply a smoothing filter selected from the same filter table used during encoding, such as table 7, based on the block size and intra-prediction mode (908). In this way, video decoder 30 may not need to make any determination regarding filtering options other than selecting a filter from a filter table.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, such as according to a communication protocol. In this manner, the computer-readable medium may generally correspond to (1) a tangible computer-readable storage medium that is not transitory, or (2) a communication medium such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. The computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Thus, the term "processor," as used herein, may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an Integrated Circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperating hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.

Claims (35)

1. A method of coding video data, the method comprising:
determining a block size associated with a current block of video data, the block size determined from a plurality of block sizes including a 4 x 4 block size, an 8 x 8 block size, a 16 x 16 block size, and a 32 x 32 block size;
determining an intra-prediction coding mode associated with the current block of video data, the intra-prediction coding mode determined from a plurality of intra-prediction coding modes, the plurality of intra-prediction coding modes including a horizontal mode, a vertical mode, a planar mode, a plurality of diagonal modes including an upper-left diagonal mode and an upper-right diagonal mode, a lower-left diagonal mode, and a DC mode, and wherein the plurality of diagonal modes includes a plurality of modes immediately adjacent to the horizontal mode or the vertical mode;
determining, based on the block size and an intra-prediction coding mode, a filtering decision whether to apply a filter to prediction samples used to predict the current block, the determining including:
if the block size is 4 x 4 block size, determining the filtering decision to be: for any of the plurality of intra-prediction coding modes, applying no filtering to prediction samples used to predict the current block,
if the block size is 8 x 8 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block only when the coding mode is the planar mode, the upper-left diagonal mode, the upper-right diagonal mode, or the lower-left diagonal mode,
if the block size is 16 x 16 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block when the coding mode is the planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, the DC mode, and the mode immediately adjacent to the horizontal mode or vertical mode,
if the block size is 32 x 32 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block when the coding mode is the planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, and the DC mode, and
intra-prediction coding the current block, wherein intra-prediction coding the current block comprises filtering or not filtering pixels used to predict the current block using the determined filtering decision.
2. The method of claim 1, wherein determining the filtering decision comprises selecting from a first filtering decision table based on the block size and the coding mode.
3. The method of claim 1, wherein determining the filtering decision comprises:
selecting from a first filtering decision table based on the block size and the coding mode when the block size is less than a threshold block size; and
when the block size is equal to or greater than a threshold block size, determining whether to select from the first or second filtering decision tables based on the block size and the coding mode.
4. The method of claim 3, further comprising:
selecting a filter from the first filter table or the second filter table;
filtering the pixels used to predict the current block using the selected filter;
evaluating a first result of application of a filter from the first filter table and a second result of application of a filter from the second filter table to determine which yields a better result; and
selecting the filter from the first filter table or the second filter table based on the determination.
5. The method of claim 4, further comprising encoding a filtering indication when the selected filter is selected from the second filter table.
6. The method of claim 1, wherein intra-prediction coding the current block comprises intra-prediction decoding the current block, wherein the current block comprises a block of video data encoded using intra-prediction coding, the method further comprising:
determining whether the encoded current block includes a filtering indication,
wherein determining the filtering decision comprises determining the filtering decision based on the block size, the coding mode, and an encoded filtering indication.
7. The method of claim 4, wherein the first filter table and the second filter table each comprise entries corresponding to different combinations of block sizes and coding modes, and wherein each of the entries comprises a filter selected from a set of filters.
8. The method of claim 7, wherein the set of filters comprises at least two filters, and wherein one of the filters comprises an all-pass filter.
9. The method of claim 1, further comprising:
determining whether to apply filtering to the current block based on the block size and the coding mode using a filtering decision table; and
if filtering is to be applied to the current block, selecting the filter from a filter table.
10. The method of claim 1, wherein the current block comprises a prediction unit coded according to short-range intra prediction (SDIP).
11. The method of claim 10, further comprising:
determining a size associated with the current block; and
determining the block size based on the size using a set of size determination rules.
12. An apparatus for coding video data, the apparatus comprising:
a video encoder configured to: determining a block size associated with a current block of video data, the block size determined from a plurality of block sizes including a 4 x 4 block size, an 8 x 8 block size, a 16 x 16 block size, and a 32 x 32 block size; determining an intra-prediction coding mode associated with the current block of video data, the intra-prediction coding mode determined from a plurality of intra-prediction coding modes, the plurality of intra-prediction coding modes including a horizontal mode, a vertical mode, a planar mode, a plurality of diagonal modes including a top-left diagonal mode and a top-right diagonal mode, a bottom-left diagonal mode, and a DC mode, and wherein the plurality of diagonal modes includes a plurality of modes immediately adjacent to the horizontal mode or the vertical mode; determining, based on the block size and an intra-prediction coding mode, a filtering decision whether to apply a filter to prediction samples used to predict the current block, the determining including: if the block size is 4 x 4 block size, determining the filtering decision to be: for any of the plurality of intra-prediction coding modes, applying no filtering to prediction samples used to predict the current block, if the block size is an 8 x 8 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block only when the coding mode is the planar mode, the top-left diagonal mode, the top-right diagonal mode, and the bottom-left diagonal mode, the filtering decision determined to be if the block size is a 16 x 16 block size: filtering prediction samples used to predict the current block when the coding mode is the planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, the DC mode, and the mode immediately adjacent to the horizontal mode or vertical mode, the filtering decision determined to 
be, if the block size is 32 x 32 block size: filtering prediction samples used to predict the current block when the coding mode is planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, and the DC mode,
and intra-prediction coding the current block, wherein intra-prediction coding the current block comprises filtering or not filtering pixels used to predict the current block using the determined filtering decision.
13. The apparatus of claim 12, wherein to determine the filtering decision, the video encoder is configured to select from a first filtering decision table based on the block size and the coding mode.
14. The apparatus of claim 12, wherein to determine the filtering decision, the video encoder is further configured to:
selecting from a first filtering decision table based on the block size and the coding mode when the block size is less than a threshold block size; and
when the block size is equal to or greater than a threshold block size, determining whether to select from the first or second filtering decision tables based on the block size and the coding mode.
15. The apparatus of claim 14, the video encoder further configured to:
selecting a filter from the first filter table or the second filter table;
filtering the pixels used to predict the current block using the selected filter;
evaluating a first result of application of a filter from the first filter table and a second result of application of a filter from the second filter table to determine which yields a better result; and
selecting the filter from the first filter table or the second filter table based on the determination.
16. The apparatus of claim 15, wherein the video encoder is further configured to encode a filtering indication when the selected filter is selected from the second filter table.
17. The apparatus of claim 15, further comprising a memory configured to store the first filter table and the second filter table.
18. The apparatus of claim 12, wherein the apparatus comprises a video decoder, wherein the current block comprises a block of video data encoded using intra-prediction coding, the video encoder further configured to:
determining whether the encoded current block includes a filtering indication,
wherein to determine the filtering decision, the video encoder is further configured to determine the filtering decision based on the block size, the coding mode, and the encoded filtering indication.
19. The apparatus of claim 15, wherein the first filter table and the second filter table each comprise entries corresponding to different combinations of block sizes and coding modes, and wherein each of the entries comprises a filter selected from a set of filters.
20. The apparatus of claim 19, wherein the set of filters comprises at least two filters, and wherein one of the filters comprises an all-pass filter.
21. The apparatus of claim 12, wherein the video encoder is further configured to:
determining whether to apply filtering to the current block based on the block size and the coding mode using a filtering decision table; and
if filtering is to be applied to the current block, selecting the filter from a filter table.
22. The apparatus of claim 12, wherein the current block comprises a prediction unit coded according to short-range intra prediction (SDIP).
23. The apparatus of claim 22, wherein the video encoder is further configured to:
determining a size associated with the current block; and
determining the block size based on the size using a set of size determination rules.
24. An apparatus for encoding video data, the apparatus comprising:
means for determining a block size associated with a current block of video data, the block size determined from a plurality of block sizes including a 4 x 4 block size, an 8 x 8 block size, a 16 x 16 block size, and a 32 x 32 block size;
means for determining an intra-prediction coding mode associated with the current block of video data, the intra-prediction coding mode determined from a plurality of intra-prediction coding modes including a horizontal mode, a vertical mode, a planar mode, a plurality of diagonal modes including an upper-left diagonal mode and an upper-right diagonal mode, a lower-left diagonal mode, and a DC mode, and wherein the plurality of diagonal modes includes a plurality of modes immediately adjacent to the horizontal mode or the vertical mode;
means for determining, based on the block size and an intra-prediction coding mode, a filtering decision whether to apply a filter to prediction samples used to predict the current block, the determination comprising:
if the block size is 4 x 4 block size, determining the filtering decision to be: for any of the plurality of intra-prediction coding modes, applying no filtering to prediction samples used to predict the current block,
if the block size is 8 x 8 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block only when the coding mode is the planar mode, the upper-left diagonal mode, the upper-right diagonal mode, or the lower-left diagonal mode,
if the block size is 16 x 16 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block when the coding mode is the planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, the DC mode, and the mode immediately adjacent to the horizontal mode or vertical mode,
if the block size is 32 x 32 block size, determining the filtering decision to be: filtering prediction samples used to predict the current block when the coding mode is the planar mode or when the coding mode is any other mode of the plurality of intra-prediction coding modes other than the horizontal mode, the vertical mode, and the DC mode; and
means for intra-prediction coding the current block, wherein intra-prediction coding the current block comprises filtering or not filtering pixels used to predict the current block using the determined filtering decision.
25. The apparatus of claim 24, wherein the means for determining the filtering decision comprises means for selecting from a first filtering decision table based on the block size and the coding mode.
26. The apparatus of claim 24, wherein the means for determining the filtering decision comprises:
means for selecting from a first filtering decision table based on the block size and the coding mode when the block size is less than a threshold block size; and
means for determining whether to select from the first or second filtering decision tables based on the block size and the coding mode when the block size is equal to or greater than a threshold block size.
27. The apparatus of claim 26, further comprising:
means for selecting a filter from the first filter table or the second filter table;
means for filtering the pixels used to predict the current block using the selected filter;
means for evaluating a first result of application of a filter from the first filter table and a second result of application of a filter from the second filter table to determine which yields a better result; and
means for selecting the filter from the first filter table or the second filter table based on the determination.
28. The apparatus of claim 27, further comprising means for encoding a filtering indication when the selected filter is selected from the second filter table.
29. The apparatus of claim 27, further comprising means for storing the first filter table and the second filter table.
30. The apparatus of claim 24, wherein the apparatus comprises a video decoder, wherein the current block comprises a block of video data encoded using intra-prediction coding, the apparatus further comprising:
means for determining whether the encoded current block includes a filtering indication,
wherein the means for determining the filtering decision comprises means for determining the filtering decision based on the block size, the coding mode, and an encoded filtering indication.
31. The apparatus of claim 27, wherein the first filter table and the second filter table each comprise entries corresponding to different combinations of block sizes and coding modes, and wherein each of the entries comprises a filter selected from a set of filters.
32. The apparatus of claim 31, wherein the set of filters comprises at least two filters, and wherein one of the filters comprises an all-pass filter.
33. The apparatus of claim 24, further comprising:
means for determining whether to apply filtering to the current block based on the block size and the coding mode using a filtering decision table; and
means for selecting the filter from a filter table if filtering is to be applied to the current block.
34. The apparatus of claim 24, wherein the current block comprises a prediction unit coded according to short-range intra prediction (SDIP).
35. The apparatus of claim 34, further comprising:
means for determining a size associated with the current block; and
means for determining the block size based on the size using a set of size determination rules.
HK13109524.1A 2010-10-01 2011-09-30 Intra smoothing filter for video coding HK1182249B (en)

Applications Claiming Priority (17)

Application Number Priority Date Filing Date Title
US38917310P 2010-10-01 2010-10-01
US61/389,173 2010-10-01
US40604610P 2010-10-22 2010-10-22
US61/406,046 2010-10-22
US201061426388P 2010-12-22 2010-12-22
US61/426,388 2010-12-22
US201061426964P 2010-12-23 2010-12-23
US61/426,964 2010-12-23
US201161492335P 2011-06-01 2011-06-01
US61/492,335 2011-06-01
US201161499579P 2011-06-21 2011-06-21
US61/499,579 2011-06-21
US201161500095P 2011-06-22 2011-06-22
US61/500,095 2011-06-22
US13/248,968 US9008175B2 (en) 2010-10-01 2011-09-29 Intra smoothing filter for video coding
US13/248,968 2011-09-29
PCT/US2011/054130 WO2012044886A1 (en) 2010-10-01 2011-09-30 Intra smoothing filter for video coding

Publications (2)

Publication Number Publication Date
HK1182249A1 HK1182249A1 (en) 2013-11-22
HK1182249B true HK1182249B (en) 2017-08-11


Similar Documents

Publication Publication Date Title
CA2812307C (en) Intra smoothing filter for video coding
CN110999291B (en) Systems and methods for partitioning video blocks in inter-predicted segments of video data
EP2628300B1 (en) Adaptive motion vector resolution signaling for video coding
EP2727353B1 (en) Video coding using adaptive motion vector resolution
KR101618021B1 (en) Video coding using a subset of intra prediction modes and corresponding directional transforms
CN106165419B (en) Adaptive Motion Vector Resolution Signaling for Video Coding
US9467692B2 (en) Intra prediction improvements for scalable video coding
EP2781094B1 (en) Reference mode selection in intra mode coding
US20130114708A1 (en) Secondary boundary filtering for video coding
CN113612997A (en) Improved video intra prediction using position dependent prediction combining for video coding
EP2756676A2 (en) Non-square transform units and prediction units in video coding
WO2015006662A2 (en) Adaptive filtering in video coding
KR20140016983A (en) Run-mode based coefficient coding for video coding
HK1182249B (en) Intra smoothing filter for video coding
HK1190018B (en) Enhanced intra-prediction mode signaling for video coding using neighboring mode
HK1190018A (en) Enhanced intra-prediction mode signaling for video coding using neighboring mode