WO2024153093A1 - Method and apparatus of combined intra block copy prediction and syntax design for video coding - Google Patents
- Publication number
- WO2024153093A1 (PCT/CN2024/072614)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- ibc
- prediction
- mode
- transform
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- All under H—ELECTRICITY, H04—ELECTRIC COMMUNICATION TECHNIQUE, H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION, H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
- H04N19/109—Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/593—Predictive coding involving spatial prediction techniques
Definitions
- The present invention is a non-Provisional Application of, and claims priority to, U.S. Provisional Patent Application No. 63/480,110, filed on January 17, 2023, U.S. Provisional Patent Application No. 63/495,588, filed on April 12, 2023, and U.S. Provisional Patent Application No. 63/500,308, filed on May 5, 2023.
- the U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
- VVC Versatile video coding
- JVET Joint Video Experts Team
- MPEG ISO/IEC Moving Picture Experts Group
- ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published Feb. 2021.
- VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources including 3-dimensional (3D) video signals.
- HEVC High Efficiency Video Coding
- Fig. 1A illustrates an exemplary adaptive Inter/Intra video encoding system incorporating loop processing.
- Intra Prediction 110 the prediction data is derived based on previously encoded video data in the current picture.
- Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture (s) and motion data.
- Switch 114 selects Intra Prediction 110 or Inter-Prediction 112, and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues.
- the prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120.
- T Transform
- Q Quantization
- the transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data.
- the bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with loop filters applied to underlying image area.
- the side information associated with Intra Prediction 110, Inter Prediction 112 and in-loop filter 130 is provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well.
- the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues.
- the residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data.
- the reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.
- incoming video data undergoes a series of processing in the encoding system.
- the reconstructed video data from REC 128 may be subject to various impairments due to a series of processing.
- in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality.
- deblocking filter (DF) may be used.
- SAO Sample Adaptive Offset
- ALF Adaptive Loop Filter
- the loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream.
- Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in the reference picture buffer 134.
- the system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC.
- the decoder can use similar or a portion of the same functional blocks as the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126.
- the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g., ILPF information, Intra prediction information and Inter prediction information) .
- the Intra prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140.
- the decoder only needs to perform motion compensation (MC 152) according to Inter prediction information received from the Entropy Decoder 140 without the need for motion estimation.
- Intra Template Matching Prediction is a special intra prediction mode that copies the best prediction block from the reconstructed part of the current frame, whose L-shaped template matches the current template. For a predefined search range, the encoder searches for the most similar template matched with the current template in a reconstructed part of the current frame and uses the corresponding block as a prediction block. The encoder then signals the usage of this mode, and the same prediction operation is performed at the decoder side.
- the prediction signal is generated by matching the L-shaped causal neighbour of the current block with another block in a predefined search area, as shown in Fig. 2.
- the current block 210 in R1 is matched with the corresponding block 212 in R2.
- the templates for the current block and the matched block are shown as darker-colour L-shaped areas.
- Area 222 corresponds to reconstructed region in the current picture 220.
- Sum of absolute differences (SAD) is used as a cost function.
- the decoder searches for the template that has least SAD with respect to the current one and uses its corresponding block as a prediction block.
- ‘a’ is a constant that controls the gain/complexity trade-off. In practice, ‘a’ is equal to 5.
- the search range of all search regions is subsampled by a factor of 2 in each dimension, which reduces the number of template matching searches by a factor of 4.
- a refinement process is performed. The refinement is done via a second template matching search around the best match with a reduced range.
- the reduced range is defined as min (BlkW, BlkH) /2.
- the Intra template matching tool is enabled for CUs with size less than or equal to 64 in width and height. This maximum CU size for Intra template matching is configurable.
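The SAD-based search described above can be sketched in Python as follows (a simplified illustration with hypothetical helper names; the real IntraTMP search covers several predefined regions and applies the subsequent refinement step, both omitted here). Only candidate positions strictly above the current block row are scanned, so that both the candidate block and its template are already reconstructed:

```python
def l_template(pic, x, y, w, h, t):
    """Flattened L-shaped template: t rows above plus t columns left of (x, y)."""
    above = [pic[yy][xx] for yy in range(y - t, y) for xx in range(x - t, x + w)]
    left = [pic[yy][xx] for yy in range(y, y + h) for xx in range(x - t, x)]
    return above + left

def intra_tmp_search(pic, cur_x, cur_y, blk_w, blk_h, t=2, step=2):
    """Sketch of the IntraTMP search: the current block's L-shaped template
    is matched by SAD against candidate positions in the reconstructed area.
    The scan is subsampled by `step` (the factor-of-2 subsampling above)."""
    cur_t = l_template(pic, cur_x, cur_y, blk_w, blk_h, t)
    best = None
    for y in range(t, cur_y - blk_h + 1, step):
        for x in range(t, len(pic[0]) - blk_w + 1, step):
            sad = sum(abs(a - b) for a, b in
                      zip(cur_t, l_template(pic, x, y, blk_w, blk_h, t)))
            if best is None or sad < best[0]:
                best = (sad, x, y)
    return best  # (cost, x, y) of the best match
```

On a picture with an 8-sample periodic texture, the search finds an exact (zero-SAD) match at an offset that is a multiple of the period.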
- the Intra template matching prediction mode is signalled at CU level through a dedicated flag when DIMD (Decoder-side Intra Mode Derivation) is not used for current CU.
- DIMD Decoder-side Intra Mode Derivation
- block vector (BV) derived from the intra template matching prediction is used for Intra Block Copy (IBC) , which will be described later.
- IBC Intra Block Copy
- the stored IntraTMP BV of the neighbouring blocks along with IBC BV are used as spatial BV candidates in IBC candidate list construction.
- IntraTMP block vector is stored in the IBC block vector buffer, and the current IBC block can use both IBC BVs and IntraTMP BVs of neighbouring blocks as BV candidates for the IBC BV candidate list, as shown in Fig. 3.
- block 310 corresponds to the current block and block 312 corresponds to a neighbouring IntraTMP block.
- the IntraTMP BV 316 is used to locate the best matching block 322 according to the matching cost between template 324 and template 314. Area 322 corresponds to reconstructed region in the current picture 330.
- IntraTMP block vectors are added to IBC block vector candidate list as spatial candidates.
- Motion Compensation, one of the key technologies in hybrid video coding, explores the pixel correlation between adjacent pictures. It is generally assumed that, in a video sequence, the patterns corresponding to objects or background in a frame are displaced to form corresponding objects in the subsequent frame or correlated with other patterns within the current frame. With the estimation of such displacement (e.g., using block matching techniques) , the pattern can be mostly reproduced without the need to re-code it. Similarly, block matching and copy has also been tried to allow selecting the reference block from the same picture as the current block. It was observed to be inefficient when applying this concept to camera-captured videos. Part of the reason is that the texture pattern in a spatially neighbouring area may be similar to the current coding block, but usually with some gradual changes over space. It is therefore difficult for a block to find an exact match within the same picture in a video captured by a camera, and the improvement in coding performance is limited.
- a new prediction mode, i.e., the intra block copy (IBC) mode, also called current picture referencing (CPR) , is introduced.
- CPR current picture referencing
- a prediction unit PU
- a displacement vector called block vector or BV
- the prediction errors are then coded using transformation, quantization and entropy coding.
- An example of CPR compensation is illustrated in Fig. 4, where block 412 is a corresponding block for block 410, and block 422 is a corresponding block for block 420.
- the reference samples correspond to the reconstructed samples of the current decoded picture prior to in-loop filter operations, both deblocking and sample adaptive offset (SAO) filters in HEVC.
- JCTVC-M0350 The very first version of CPR was proposed in JCTVC-M0350 (Budagavi et al., AHG8: Video coding using Intra motion compensation, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11, 13th Meeting: Incheon, KR, 18–26 Apr. 2013, Document: JCTVC-M0350) to the HEVC Range Extensions (RExt) development.
- the CPR compensation was limited to be within a small local area, with only 1-D block vector and only for block size of 2Nx2N.
- HEVC SCC (Screen Content Coding)
- the IBC in VVC allows referencing only the reconstructed portion of a predefined area, including the region of the current CTU and some region of the left CTU.
- Fig. 5 illustrates the reference region of IBC Mode, where each block represents 64x64 luma sample unit. Depending on the location of the current coded CU within the current CTU, the following applies:
- If the current block falls into the top-left 64x64 block of the current CTU (case 510 in Fig. 5) , then in addition to the already reconstructed samples in the current CTU, it can also refer to the reference samples in the bottom-right 64x64 block of the left CTU, using current picture referencing (CPR) mode.
- the current block can also refer to the reference samples in the bottom-left 64x64 block of the left CTU and the reference samples in the top-right 64x64 block of the left CTU, using CPR mode.
- If the current block falls into the top-right 64x64 block of the current CTU, then, if the luma location (0, 64) relative to the current CTU has not yet been reconstructed, the current block can also refer to the reference samples in the bottom-left 64x64 block and bottom-right 64x64 block of the left CTU, using CPR mode; otherwise, the current block can also refer to reference samples in the bottom-right 64x64 block of the left CTU.
- If the current block falls into the bottom-left 64x64 block of the current CTU, then, if the luma location (64, 0) relative to the current CTU has not yet been reconstructed, the current block can also refer to the reference samples in the top-right 64x64 block and bottom-right 64x64 block of the left CTU, using CPR mode. Otherwise, the current block can also refer to the reference samples in the bottom-right 64x64 block of the left CTU, using CPR mode.
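The quadrant-dependent rules above can be sketched as a small lookup (a simplified illustration; the function and parameter names are hypothetical, and `colocated_reconstructed` stands in for the spec's checks on whether the relevant co-located 64x64 unit of the current CTU has already been reconstructed):

```python
def left_ctu_ref_units(cu_x, cu_y, colocated_reconstructed=False, ctu=128):
    """Which 64x64 units (col, row) of the LEFT CTU may serve as IBC
    reference, keyed by the 64x64 quadrant containing the current CU.
    Assumes a 128x128 CTU split into four 64x64 quadrants."""
    q = ((cu_x % ctu) // 64, (cu_y % ctu) // 64)
    if q == (0, 0):                       # top-left quadrant
        return [(1, 1), (0, 1), (1, 0)]   # BR, BL, TR of the left CTU
    if q == (1, 0):                       # top-right quadrant
        return [(1, 1)] if colocated_reconstructed else [(0, 1), (1, 1)]
    if q == (0, 1):                       # bottom-left quadrant
        return [(1, 1)] if colocated_reconstructed else [(1, 0), (1, 1)]
    return []                             # bottom-right: current CTU only
```

For example, a CU in the bottom-right quadrant of its CTU may not reference the left CTU at all, while a CU in the top-left quadrant may reference three 64x64 units of the left CTU (subject to the availability conditions omitted here).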
- In VVC, the IBC mode interacts with the following inter coding tools:
- HMVP history based motion vector predictor
- CIIP combined intra/inter prediction mode
- MMVD merge mode with motion vector difference
- GPM geometric partitioning mode
- - IBC can be used with pairwise merge candidate and HMVP.
- a new pairwise IBC merge candidate can be generated by averaging two IBC merge candidates.
- IBC motion is inserted into history buffer for future referencing.
- - IBC cannot be used in combination with the following inter tools: affine motion, CIIP, MMVD, and GPM.
- the current picture is no longer included as one of the reference pictures in the reference picture list 0 for IBC prediction.
- the derivation process of motion vectors for IBC mode excludes all neighbouring blocks in inter mode and vice versa.
- the following IBC design aspects are applied:
- IBC shares the same process as in regular MV merge including pairwise merge candidate and history-based motion predictor but disallows TMVP and zero vector because they are invalid for IBC mode.
- separate HMVP buffers (5 candidates each) are used for conventional MV and IBC.
- Block vector constraints are implemented in the form of a bitstream conformance constraint: the encoder needs to ensure that no invalid vectors are present in the bitstream, and merge shall not be used if the merge candidate is invalid (out of range or 0) .
- bitstream conformance constraint is expressed in terms of a virtual buffer as described below.
- IBC is handled as inter mode.
- for IBC, AMVR does not use quarter-pel precision; instead, AMVR is signalled only to indicate whether the MV is 1-pel or 4-pel integer precision.
- the number of IBC merge candidates can be signalled in the slice header separately from the numbers of regular, subblock, and geometric merge candidates.
- a virtual buffer concept is used to describe the allowable reference region for IBC prediction mode and valid block vectors.
- Denoting the CTU size as ctbSize, the width of the virtual buffer is wIbcBuf = 128x128/ctbSize and its height is hIbcBuf = ctbSize.
- The VPDU (Virtual Pipeline Data Unit) width is Wv = min (ctbSize, 64) .
- the virtual IBC buffer, ibcBuf, is maintained as follows:
- For a block vector bv to be valid for a sample at (x, y) , ibcBuf [ (x + bv [0] ) % wIbcBuf] [ (y + bv [1] ) % ctbSize] shall not be equal to -1.
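The buffer-based validity rule can be sketched as follows (hypothetical helper; the buffer is indexed [x][y] to mirror the formula above, with -1 marking positions that do not hold reconstructed data):

```python
def bv_valid(ibc_buf, x, y, w, h, bv, w_ibc_buf, ctb_size):
    """Conformance check sketched from the rule above: every sample of the
    referenced w x h block must map, via modulo addressing, to a
    virtual-buffer position holding reconstructed data (not -1)."""
    for dx in range(w):
        for dy in range(h):
            if ibc_buf[(x + dx + bv[0]) % w_ibc_buf][(y + dy + bv[1]) % ctb_size] == -1:
                return False
    return True
```

With a 128-sample CTU (so wIbcBuf = 128x128/128 = 128), a BV pointing into a region already marked reconstructed is valid, while one reaching into unwritten buffer positions is not.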
- PDPC position dependent intra prediction combination
- PDPC is an intra prediction method which invokes a combination of the boundary reference samples and HEVC-style intra prediction with filtered boundary reference samples.
- PDPC is applied to the following intra modes without signalling: planar, DC, intra angles less than or equal to horizontal, and intra angles greater than or equal to vertical and less than or equal to 80. If the current block is BDPCM mode or MRL index is larger than 0, PDPC is not applied.
- Rx,-1 and R-1,y represent the reference samples located at the top and left boundaries of current sample (x, y) , respectively.
- since PDPC is applied to DC, planar, horizontal, and vertical intra modes, additional boundary filters are not needed, as required in the case of the HEVC DC mode boundary filter or horizontal/vertical mode edge filters.
- PDPC process for DC and Planar modes is identical.
- For angular modes, if the current angular mode is HOR_IDX or VER_IDX, the left or top reference samples are not used, respectively.
- the PDPC weights and scaling factors are dependent on prediction modes and the block sizes. PDPC is applied to the block with both width and height greater than or equal to 4.
- Figs. 6A-D illustrate the definition of reference samples (R x, -1 and R -1, y ) for PDPC applied over various prediction modes, where Fig. 6A corresponds to the diagonal top-right mode, Fig. 6B corresponds to the diagonal bottom-left mode, Fig. 6C corresponds to the adjacent diagonal top-right mode and Fig. 6D corresponds to the adjacent diagonal bottom-left mode.
- the prediction sample pred (x′, y′) is located at (x′, y′) within the prediction block.
- the reference samples R x, -1 and R -1, y could be located in fractional sample position. In this case, the sample value of the nearest integer sample location is used.
- PDPC may not be applied due to the unavailability of the secondary reference samples.
- a gradient based PDPC, extended from the horizontal/vertical mode, is applied (as disclosed in JVET-Q0391 (Bappaditya Ray, et al., “Unified PDPC for angular intra modes” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 17th Meeting: Brussels, BE, 7–17 January 2020, Document: JVET-Q0391) ) .
- the PDPC weights (wT /wL) and nScale parameter for determining the decay in PDPC weights with respect to the distance from left/top boundary are set equal to corresponding parameters in horizontal/vertical mode, respectively.
- JVET-O0537 Weighted Intra and Inter Prediction Mode
- a new mode namely Weighted Intra and Inter Prediction (WIIP) mode
- WIIP Weighted Intra and Inter Prediction
- when the CIIP flag is true, a WIIP flag is further signalled to indicate whether to use WIIP mode or not.
- when the WIIP flag is true, WIIP mode is used; otherwise (i.e., when the WIIP flag is false) , the original CIIP mode in VTM-5.0 is used.
- same block size restrictions of CIIP mode are also applied to WIIP mode. Two different alternatives are proposed as follows.
- Method #1: The first alternative method is the same as that in JVET-N0395 (Liang Zhao, et al., “CE10-related: Simplification and improvement of combined intra-inter prediction mode” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 14th Meeting: Geneva, CH, 19–27 March 2019, Document: JVET-N0395) . The PDPC filtering process is directly applied to inter prediction samples to generate the final prediction sample values. The flowchart of the first method is shown in Fig. 7, where the prediction samples from inter prediction unit 710, along with intra mode data and reconstructed data, are applied to PDPC filtering process 720.
- Method #2: In the second alternative method, the PDPC process no longer depends on inter prediction data. It is based on the neighbouring reconstructed samples, the same as CIIP. Then, the final prediction values of WIIP mode are calculated as a weighted average of PDPC prediction values and inter prediction values.
- predPdpc (x, y) = (WL × R-1,y + WT × Rx,-1 + 32) >> 6
- WL and WT represent the weighting factors, and R-1,y and Rx,-1 represent the left and above neighbouring reconstructed samples.
- WL and WT are the same as in the PDPC process.
- the inter prediction values (predInter) and PDPC prediction (predPdpc) are combined to generate the final prediction values, as shown follows:
- FinalPred (x, y) = ( (predPdpc (x, y) << 6) + (64 – WL – WT) × predInter (x, y) + 32) >> 6
- equivalently, FinalPred (x, y) = (WL × R-1,y + WT × Rx,-1 + (64 – WL – WT) × predInter (x, y) + 32) >> 6.
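The Method #2 combination can be illustrated with a small sketch (names are illustrative; `rec_left[y]` plays the role of R-1,y and `rec_top[x]` the role of Rx,-1, and the right-shift by 6 implements the division by the total weight 64):

```python
def wiip_predict(pred_inter, rec_left, rec_top, w_l, w_t):
    """Blend the PDPC term (from neighbouring reconstructed samples) with
    the inter prediction, per the combined formula above."""
    h, w = len(pred_inter), len(pred_inter[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = (w_l * rec_left[y] + w_t * rec_top[x]
                         + (64 - w_l - w_t) * pred_inter[y][x] + 32) >> 6
    return out
```

For instance, with WL = WT = 16, neighbouring samples of 200 and an inter prediction of 100, each output sample is (16*200 + 16*200 + 32*100 + 32) >> 6 = 150.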
- the flowchart of the second method is shown in Fig. 8, where the final predictor is generated based on PDPC prediction 820 and inter prediction 830 using a combiner 840.
- the input for PDPC prediction 820 is from intra prediction 810.
- IBC-LIC Intra block copy with local illumination compensation (IBC-LIC) is a coding tool which compensates the local illumination variation within a picture between the CU coded with IBC and its prediction block with a linear equation.
- IBC-LIC can be applied to IBC AMVP mode and IBC merge mode.
- IBC AMVP IBC with Advanced Motion Vector Prediction Mode
- IBC AMVP corresponds to the IBC mode that directly signals motion vector predictor index, and motion vector difference if necessary.
- the name IBC AMVP is used in this disclosure.
- for IBC AMVP mode, an IBC-LIC flag is signalled to indicate the use of IBC-LIC.
- for IBC merge mode, the IBC-LIC flag is inferred from the merge candidate.
- IBC-GPM Intra block copy with geometry partitioning mode
- IBC-GPM can be applied to regular IBC merge mode or IBC TM merge mode.
- An intra prediction mode (IPM) candidate list is constructed using the same method as GPM with inter and intra prediction for intra prediction, and the IPM candidate list size is pre-defined as 3.
- IPM intra prediction mode
- an IBC-GPM geometry partitioning mode set flag is signalled to indicate whether the first or the second geometry partitioning mode set is selected, followed by the geometry partitioning mode index.
- An IBC-GPM intra flag is signalled to indicate whether intra prediction is used for the first sub-partition.
- for a sub-partition using intra prediction, an intra prediction mode index is signalled; for a sub-partition predicted by IBC, a merge index is signalled.
- IBC-CIIP is adopted in ECM.
- Combined intra block copy and intra prediction (IBC-CIIP) is a coding tool for a CU which uses IBC with merge mode and intra prediction to obtain two prediction signals, and the two prediction signals are combined by a weighted sum to generate the final prediction.
- the intra prediction is planar or DC mode
- P ibc and P intra denote the IBC prediction signal and intra prediction signal, respectively.
- (wibc, shift) are set equal to (13, 4) and (1, 1) for IBC merge mode and IBC AMVP mode, respectively.
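A sketch of the weighted sum, assuming a total weight of 1 << shift split between the two signals and a conventional rounding offset of 1 << (shift - 1) (the exact rounding form is not given above and is an assumption):

```python
def ibc_ciip_blend(p_ibc, p_intra, w_ibc, shift):
    """Weighted sum of the IBC and intra prediction signals: w_ibc on the
    IBC signal, the remainder of (1 << shift) on the intra signal."""
    w_intra = (1 << shift) - w_ibc
    rnd = 1 << (shift - 1)
    return [[(w_ibc * a + w_intra * b + rnd) >> shift
             for a, b in zip(ra, rb)] for ra, rb in zip(p_ibc, p_intra)]
```

With (13, 4) the IBC signal gets weight 13/16 and the intra signal 3/16; with (1, 1) the two signals are averaged.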
- the merge candidates are adaptively reordered with template matching (TM) .
- TM template matching
- the reordering method is applied to regular merge mode, TM merge mode, and affine merge mode (excluding the SbTMVP (Subblock-based Temporal Motion Vector Prediction) candidate) .
- for TM merge mode, merge candidates are reordered before the refinement process.
- An initial merge candidate list is firstly constructed according to a given checking order, such as spatial, TMVPs, non-adjacent, HMVPs, pairwise, virtual merge candidates. Then the candidates in the initial list are divided into multiple subgroups.
- for adaptive DMVR mode, each merge candidate in the initial list is firstly refined by using TM/multi-pass DMVR.
- Merge candidates in each subgroup are reordered to generate a reordered merge candidate list and the reordering is according to cost values based on template matching.
- the index of the selected merge candidate in the reordered merge candidate list is signalled to the decoder. For simplification, merge candidates in the last subgroup (when it is not the first) are not reordered. All the zero candidates from the ARMC reordering process are excluded during the construction of the merge motion vector candidate list.
- the subgroup size is set to 5 for regular merge mode and TM merge mode.
- the subgroup size is set to 3 for affine merge mode.
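The subgroup-wise reordering can be sketched as follows (illustrative names; real ARMC computes TM costs from templates rather than taking an arbitrary cost function, and operates on motion candidates rather than plain values):

```python
def armc_reorder(candidates, tm_cost, subgroup=5):
    """ARMC-style reordering: candidates are sorted by ascending
    template-matching cost within each subgroup; the last subgroup is
    left as-is when it is not also the first, matching the
    simplification described above."""
    out = []
    n = len(candidates)
    for start in range(0, n, subgroup):
        group = candidates[start:start + subgroup]
        is_last = start + subgroup >= n       # final (possibly partial) subgroup
        if is_last and start > 0:
            out.extend(group)                 # last-but-not-first: not reordered
        else:
            out.extend(sorted(group, key=tm_cost))
    return out
```

With eight candidates and subgroup size 5, only the first five are reordered; a list that fits in a single subgroup is fully reordered.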
- the template matching cost of a merge candidate during the reordering process is measured by the SAD between samples of a template of the current block and their corresponding reference samples.
- the template comprises a set of reconstructed samples neighbouring to the current block. Reference samples of the template are located by the motion information of the merge candidate.
- for a merge candidate using bi-prediction, the reference samples of the template of the merge candidate are also generated by bi-prediction, as shown in Fig. 9.
- block 912 corresponds to a current block in current picture 910
- blocks 922 and 932 correspond to reference blocks in reference pictures 920 and 930 in list 0 and list 1 respectively.
- Templates 914 and 916 are for current block 912
- templates 924 and 926 are for reference block 922
- templates 934 and 936 are for reference block 932.
- Motion vectors 940, 942 and 944 are the list 0 motion vector of the merge candidate and motion vectors 950, 952 and 954 are the list 1 motion vector of the merge candidate.
- When multi-pass DMVR is used to derive the refined motion for the initial merge candidate list, only the first pass (i.e., PU level) of multi-pass DMVR is applied in reordering.
- the template size is set equal to 1. Only the above or left template is used during the motion refinement of TM when the block is flat (block width greater than 2 times the height) or narrow (block height greater than 2 times the width) . TM is extended to perform 1/16-pel MVD precision. The first four merge candidates are reordered with the refined motion in TM merge mode.
- a candidate is considered as redundant if the cost difference between the candidate and its predecessor is less than a lambda value λ.
- the algorithm according to ARMC-TM is defined as follows:
- if this minimum cost difference is less than λ, the candidate is considered as redundant, and it is moved to a further position in the list. This further position is the first position where the candidate is diverse enough compared to its predecessor.
- the algorithm stops after a finite number of iterations (if the minimum cost difference is not less than λ) .
- This algorithm is applied to the Regular, TM, BM and Affine merge modes.
- a similar algorithm is applied to the Merge MMVD and sign MVD prediction methods which also use ARMC for the reordering.
- the value of ⁇ is set equal to the ⁇ of the rate distortion criterion used to select the best merge candidate at the encoder side for low delay configuration and to the value ⁇ corresponding to another QP for Random Access configuration.
- a set of ⁇ values corresponding to each signalled QP offset is provided in the SPS or in the Slice Header for the QP offsets which are not present in the SPS.
- the ARMC design is also applicable to the AMVP mode wherein the AMVP candidates are reordered according to the TM cost.
- AMVP advanced motion vector prediction
- an initial AMVP candidate list is constructed, followed by a refinement from TM to construct a refined AMVP candidate list.
- an MVP candidate with a TM cost larger than a threshold is skipped.
- when wrap-around motion compensation is enabled, the MV candidate shall be clipped with the wrap-around offset taken into consideration.
- a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple selected transforms from DST7/DCT8.
- the newly introduced transform matrices are DST-VII and DCT-VIII. Table 3 shows the basis functions of the selected DST/DCT.
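The DST-VII and DCT-VIII basis functions referenced by Table 3 can be generated and checked for orthonormality with a short sketch (standard definitions; function names are illustrative):

```python
import math

def dst7(n):
    """DST-VII basis: T_i(j) = sqrt(4/(2N+1)) * sin(pi*(2i+1)*(j+1)/(2N+1))."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]

def dct8(n):
    """DCT-VIII basis: T_i(j) = sqrt(4/(2N+1)) * cos(pi*(2i+1)*(2j+1)/(4N+2))."""
    s = math.sqrt(4.0 / (2 * n + 1))
    return [[s * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]
```

Both matrices are orthonormal, i.e. each basis row has unit norm and distinct rows are orthogonal, which is what makes the transforms invertible by their transposes.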
- the transform matrices are quantized more accurately than the transform matrices in HEVC.
- in order to control the MTS scheme, separate enabling flags are specified at the SPS level for intra and inter, respectively.
- a CU level flag is signalled to indicate whether MTS is applied or not.
- MTS is applied only for luma. The MTS signalling is skipped when one of the below conditions is applied:
- the position of the last significant coefficient for the luma TB is less than 1 (i.e., DC only) ,
- the last significant coefficient of the luma TB is located inside the MTS zero-out region.
- if the MTS CU flag is equal to zero, then DCT-2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signalled to indicate the transform type for the horizontal and vertical directions, respectively.
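The flag-to-transform mapping just described can be expressed procedurally; this is an illustrative sketch of the Table 4 mapping, with hypothetical function and flag names:

```python
def mts_transform_pair(mts_cu_flag, mts_hor_flag=0, mts_ver_flag=0):
    """Return the (horizontal, vertical) transform types selected by the MTS
    CU-level flags: flag value 0 selects DST-7, flag value 1 selects DCT-8."""
    if mts_cu_flag == 0:
        return ("DCT2", "DCT2")              # DCT-2 applied in both directions
    pick = lambda f: "DCT8" if f else "DST7"
    return (pick(mts_hor_flag), pick(mts_ver_flag))
```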
- the transform and signalling mapping is shown in Table 4. A unified transform selection for ISP and implicit MTS is used by removing the intra-mode and block-shape dependencies. If the current block is in ISP mode, or if the current block is an intra block and both intra and inter explicit MTS are on, then only DST-7 is used for both horizontal and vertical transform cores. As for transform matrix precision, 8-bit primary transform cores are used.
- the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point and 32-point DCT-2. The other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point and 32-point DST-7 and DCT-8, also use 8-bit primary transform cores.
- the residual of a block can be coded with transform skip mode.
- the transform skip flag is not signalled when the CU level MTS_CU_flag is not equal to zero.
- implicit MTS transform is set to DCT2 when LFNST or MIP is activated for the current CU. Also, the implicit MTS can be still enabled when MTS is enabled for inter coded blocks.
- LFNST is applied between forward primary transform and quantization (at the encoder side) and between de-quantization and inverse primary transform (at the decoder side) as shown in Fig. 10.
- after Forward Primary Transform 1010, Forward Low-Frequency Non-Separable Transform (LFNST) 1020 is applied to the top-left region 1022 of the forward primary transform output.
- in LFNST, a 4x4 non-separable transform or an 8x8 non-separable transform is applied according to the block size.
- 4x4 LFNST is applied for small blocks (i.e., min (width, height) < 8) and 8x8 LFNST is applied for larger blocks (i.e., min (width, height) > 4) .
- the transform coefficients are quantized by Quantization 1030.
- the quantized transform coefficients are de-quantized using De-Quantization 1040 to obtain the de-quantized transform coefficients.
- Inverse LFNST 1050 is applied to the top-left region (8 coefficients for 4x4 inverse LFNST or 16 coefficients for 8x8 inverse LFNST) .
- inverse Primary Transform 1060 is applied to recover the input signal.
- to apply the non-separable transform, the 4x4 input block X is first represented as a 16x1 vector; the non-separable transform is calculated as F = T·X, where F indicates the 16x1 transform coefficient vector and T is a 16x16 transform matrix.
- the 16x1 coefficient vector is subsequently re-organized as 4x4 block using the scanning order for that block (horizontal, vertical or diagonal) .
- the coefficients with smaller index will be placed with the smaller scanning index in the 4x4 coefficient block.
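The 4x4 non-separable transform and its inverse can be sketched as a single matrix multiplication; the 16x16 kernel T below is an arbitrary orthogonal matrix standing in for the standardized LFNST kernel tables, which are not reproduced in this description:

```python
import numpy as np

def lfnst4x4_forward(block, T):
    """Apply a 4x4 non-separable transform: flatten the 4x4 block into a
    16-element vector X and compute F = T @ X with a 16x16 kernel T."""
    return T @ block.reshape(16)

def lfnst4x4_inverse(coeffs, T):
    """For an orthogonal kernel, the inverse transform is the transpose."""
    return (T.T @ coeffs).reshape(4, 4)
```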
- LFNST low-frequency non-separable transform
- N is commonly equal to 64 for 8x8 NSST
- for RST (Reduced Secondary Transform) , R/N (R < N) is the reduction factor
- the inverse transform matrix for RT is the transpose of its forward transform.
- for 8x8 LFNST, a reduction factor of 4 is applied, and the 64x64 direct matrix, which is the conventional 8x8 non-separable transform matrix size, is reduced to a 16x48 direct matrix.
- the 48x16 inverse RST matrix is used at the decoder side to generate core (primary) transform coefficients in the 8x8 top-left region.
- 16x48 matrices are applied instead of 16x64 with the same transform set configuration, each of which takes 48 input data from three 4x4 blocks in a top-left 8x8 block excluding the right-bottom 4x4 block.
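The reduced-matrix application can be sketched as below; the kernel values are placeholders, and the gather of 48 inputs follows the three 4x4 sub-blocks named above:

```python
import numpy as np

def rst8x8_forward(primary8x8, T16x48):
    """Gather 48 inputs from the top-left 8x8 region (excluding its bottom-right
    4x4 sub-block) and reduce them to 16 secondary transform coefficients."""
    x = np.concatenate([primary8x8[0:4, 0:4].ravel(),   # top-left 4x4
                        primary8x8[0:4, 4:8].ravel(),   # top-right 4x4
                        primary8x8[4:8, 0:4].ravel()])  # bottom-left 4x4
    return T16x48 @ x                                   # 16 coefficients

def rst8x8_inverse(coeffs16, T16x48):
    """The 48x16 inverse RST matrix is the transpose of the forward matrix."""
    return T16x48.T @ coeffs16                          # 48 primary coefficients
```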
- in order to reduce complexity, LFNST is restricted to be applicable only if all coefficients outside the first coefficient sub-group are non-significant. Hence, all primary-only transform coefficients have to be zero when LFNST is applied. This allows conditioning the LFNST index signalling on the last-significant position, and hence avoids the extra coefficient scanning in the current LFNST design, which was needed only for checking for significant coefficients at specific positions.
- the worst-case handling of LFNST (in terms of multiplications per pixel) restricts the non-separable transforms for 4x4 and 8x8 blocks to 8x16 and 8x48 transforms, respectively.
- the last-significant scan position has to be less than 8 when LFNST is applied for these sizes, and less than 16 for other sizes.
- the proposed restriction implies that the LFNST is now applied only once, and only to the top-left 4x4 region.
- the quantization of coefficients is remarkably simplified when LFNST transforms are tested. A rate-distortion optimized quantization has to be done at most for the first 16 coefficients (in scan order) ; the remaining coefficients are enforced to be zero.
- transform set 0 is selected for the current chroma block.
- the selected non-separable secondary transform candidate is further specified by the explicitly signalled LFNST index. The index is signalled in a bit-stream once per Intra CU after transform coefficients.
- LFNST index coding depends on the position of the last significant coefficient.
- the LFNST index is context coded but does not depend on intra prediction mode, and only the first bin is context coded.
- LFNST is applied for intra CU in both intra and inter slices, and for both luma and chroma. If a dual tree is enabled, LFNST indices for luma and chroma are signalled separately. For inter slice (the dual tree is disabled) , a single LFNST index is signalled and used for both luma and chroma.
- an LFNST index search could increase data buffering by four times for a certain number of decode pipeline stages. Therefore, the maximum size for which LFNST is allowed is restricted to 64x64. Note that LFNST is enabled with DCT2 only. The LFNST index signalling is placed before the MTS index signalling.
- VTM VVC Test Model
- subblock transform is introduced for an inter-predicted CU.
- in this transform mode, only a sub-part of the residual block is coded for the CU.
- when cu_cbf is equal to 1, cu_sbt_flag may be signalled to indicate whether the whole residual block or a sub-part of the residual block is coded.
- inter MTS information is further parsed to determine the transform type of the CU.
- a part of the residual block is coded with inferred adaptive transform and the other part of the residual block is zeroed out.
- SBT type and SBT position information are signalled in the bitstream.
- SBT-V or SBT-H
- the TU width (or height) may equal half of the CU width (or height) or 1/4 of the CU width (or height) , resulting in a 2:2 split or a 1:3/3:1 split.
- the 2:2 split is like a binary tree (BT) split while the 1:3/3:1 split is like an asymmetric binary tree (ABT) split.
- as in ABT splitting, only the small region contains the non-zero residual. If one dimension of a CU is 8 in luma samples, the 1:3/3:1 split along that dimension is not allowed. There are at most 8 SBT modes for a CU.
- Position-dependent transform core selection is applied on luma transform blocks in SBT-V and SBT-H (chroma TB always using DCT-2) .
- the two positions of SBT-H (1130 and 1140) and SBT-V (1110 and 1120) are associated with different core transforms, as shown in Fig. 11. More specifically, the horizontal and vertical transforms for each SBT position are specified in Fig. 11.
- the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively.
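Under the position-dependent selection above, the transform pair per SBT position can be sketched as follows; the position-1 pairs and the large-TU fallback to DCT-2 are stated here as assumptions matching VTM behaviour, not taken from this description:

```python
def sbt_luma_transform_pair(sbt_dir, sbt_pos, tu_w, tu_h):
    """Return the (horizontal, vertical) transform types for a luma SBT TB.
    Chroma TBs always use DCT-2 and are not handled here."""
    if tu_w > 32 or tu_h > 32:
        return ("DCT2", "DCT2")          # large-TU fallback (assumption)
    if sbt_dir == "SBT-V":
        return ("DCT8", "DST7") if sbt_pos == 0 else ("DST7", "DST7")
    return ("DST7", "DCT8") if sbt_pos == 0 else ("DST7", "DST7")  # SBT-H
```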
- the subblock transform jointly specifies the TU tiling, cbf, and horizontal and vertical core transform type of a residual block.
- the SBT is not applied to the CU coded with combined inter-intra mode.
- BDPCM Block Differential Pulse Coded Modulation
- VVC supports block differential pulse coded modulation (BDPCM) for screen content coding.
- a flag is transmitted at the CU level if the CU size is smaller than or equal to MaxTsSize by MaxTsSize in terms of luma samples and if the CU is intra coded, where MaxTsSize is the maximum block size for which the transform skip mode is allowed. This flag indicates whether regular intra coding or BDPCM is used. If BDPCM is used, a BDPCM prediction direction flag is transmitted to indicate whether the prediction is horizontal or vertical. Then, the block is predicted using the regular horizontal or vertical intra prediction process with unfiltered reference samples. The residual is quantized and the difference between each quantized residual and its predictor (i.e. the previously coded residual of the horizontal or vertical (depending on the BDPCM prediction direction) neighbouring position) is coded.
- BDPCM is applied to the quantized residual values, resulting in a modified M×N array R̃ with elements r̃ (i, j) , where r̃ (i, j) is predicted from its neighbouring quantized residual value.
- for the vertical BDPCM prediction mode, for 0 ≤ j ≤ (N-1) , the following is used to derive r̃ (i, j) : r̃ (i, j) = Q (r (i, j) ) for i = 0, and r̃ (i, j) = Q (r (i, j) ) - Q (r (i-1, j) ) for 1 ≤ i ≤ (M-1) .
- the inverse quantized residuals, Q -1 (Q (r (i, j) ) ) , are added to the intra block prediction values to produce the reconstructed sample values.
- the predicted quantized residual values are sent to the decoder using the same residual coding process as that in transform skip mode residual coding.
- in case slice_ts_residual_coding_disabled_flag is set to 1, the quantized residual values are sent to the decoder using regular transform residual coding.
- horizontal or vertical prediction mode is stored for a BDPCM-coded CU if the BDPCM prediction direction is horizontal or vertical, respectively.
- for deblocking, if both blocks on the sides of a block boundary are coded using BDPCM, then that particular block boundary is not deblocked.
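The vertical BDPCM residual prediction and its inverse can be sketched on quantized residuals as follows (quantization itself is omitted; names are illustrative):

```python
import numpy as np

def bdpcm_vertical_predict(q):
    """q: M x N array of quantized residuals Q(r).  Each row except the first is
    predicted from the row above, leaving only the differences to be coded."""
    d = q.copy()
    d[1:, :] = q[1:, :] - q[:-1, :]
    return d

def bdpcm_vertical_reconstruct(d):
    """Undo the prediction: a running sum down each column recovers Q(r)."""
    return np.cumsum(d, axis=0)
```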
- VVC (Adrian Browne, et al., “Algorithm description for Versatile Video Coding and Test Model 14 (VTM 14) ” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 23rd Meeting, by teleconference, 7–16 July 2021, Document: W2002) supports the joint coding of chroma residual (JCCR) tool (section 3.5.7 of JVET-W2002) where the chroma residuals for colour components are coded jointly.
- JCCR joint coding of chroma residual
- the usage (i.e., activation) of the JCCR mode is indicated by a TU-level flag tu_joint_cbcr_residual_flag and the selected mode is implicitly indicated by the chroma CBFs (i.e., coded block flags) .
- the flag tu_joint_cbcr_residual_flag is present if either or both chroma CBFs for a TU are equal to 1.
- chroma QP (quantization parameter) offset values are signalled for the JCCR mode to differentiate from the usual chroma QP offset values signalled for the regular chroma residual coding mode.
- chroma QP offset values are used to derive the chroma QP values for some blocks coded by the JCCR mode.
- the JCCR mode has 3 sub-modes as shown in Table 6, where the reconstruction process of the chroma residuals (resCb and resCr) from the transmitted transform blocks is shown.
- one single joint chroma residual block (resJointC [x] [y] in Table 6) is signalled, and the residual block for Cb (resCb) and the residual block for Cr (resCr) are derived considering information such as tu_cbf_cb, tu_cbf_cr, and CSign, which is a sign value specified in the slice header.
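The three JCCR sub-modes of Table 6 can be sketched as below, following the VVC derivation; CSign is +1 or -1, and `>>` denotes an arithmetic right shift:

```python
def jccr_reconstruct(resJointC, tu_cbf_cb, tu_cbf_cr, CSign):
    """Derive the Cb and Cr residual samples from one joint residual sample."""
    if tu_cbf_cb and not tu_cbf_cr:              # sub-mode 1
        return resJointC, (CSign * resJointC) >> 1
    if tu_cbf_cb and tu_cbf_cr:                  # sub-mode 2
        return resJointC, CSign * resJointC
    return (CSign * resJointC) >> 1, resJointC   # sub-mode 3 (cbf_cb = 0, cbf_cr = 1)
```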
- NSPT Non-Separable Primary Transform
- the separable DCT-II plus LFNST transform combinations are replaced with NSPT for the block shapes 4x4, 4x8, 8x4, 8x8, 4x16, 16x4, 8x16 and 16x8.
- NSPTs consist of 35 sets with 3 candidates each (similar to the current LFNST) .
- the kernels of NSPTs have the following shapes:
- a method and apparatus for video coding are disclosed. According to the method, input data associated with a current block are received, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side.
- the current block is partitioned into a first sub-partition and a second sub-partition.
- First prediction is derived for the first sub-partition, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor.
- Second prediction is derived for the second sub-partition, wherein the second prediction comprises an inter predictor.
- the first sub-partition and the second sub-partition of the current block are encoded or decoded using information comprising the first prediction and the second prediction respectively.
- the IBC-based predictor corresponds to an IBC AMVP (Adaptive Motion Vector Prediction) predictor or an IBC Merge predictor.
- the inter predictor corresponds to a regular merge predictor, MMVD (Merge with Motion Vector Difference) predictor, TM merge predictor, BM (Block Matching) merge predictor, CIIP (Combined Inter Merge/Intra Prediction) predictor, Affine merge predictor, AMVP predictor, or SbtMVP (Subblock-based Temporal Motion Vector Prediction) predictor.
- a joint IBC and regular merge candidate list is constructed by including at least one IBC candidate and at least one regular merge candidate, and at least one target merge candidate is selected from the joint IBC and regular merge candidate list for one of the first sub-partition and the second sub-partition.
- a second target candidate is selected from an intra prediction mode candidate list or a TM (Template Matching) merge candidate list for another of the first sub-partition and the second sub-partition.
- input data associated with a current block are received, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side.
- a prediction candidate list is derived, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates.
- the current block is encoded or decoded in an IBC mode or an intra prediction mode using information comprising the prediction candidate list.
- a target candidate is selected from the prediction candidate list without a need for signalling or parsing a flag indicating the IBC mode.
- said one or more IBC-based prediction candidates comprise one or more IBC spatial candidates, one or more non-adjacent candidates, one or more HMVP (History-based Motion Vector Prediction) candidates, or a combination thereof.
- member candidates in the prediction candidate list are reordered according to TM (Template Matching) cost and a subset of the member candidates with lowest TM costs are kept in the prediction candidate list.
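The reordering-and-pruning step for the joint candidate list can be sketched as follows (function and parameter names hypothetical):

```python
def prune_joint_candidate_list(candidates, tm_cost, keep):
    """Reorder a joint IBC/intra candidate list by ascending TM cost and keep
    only the 'keep' candidates with the lowest costs."""
    return sorted(candidates, key=tm_cost)[:keep]
```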
- TM Template Matching
- the current block is allowed to reference one or more previously coded blocks in the IBC mode.
- input data associated with a current block are received, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and the current block is coded in an IBC (Intra Block Copy) mode.
- IBC Intra Block Copy
- a transform syntax is signalled or parsed in a same way as an inter-coded block, wherein the transform syntax is associated with transform information for the current block. Forward transform or inverse transform is applied to the current block using the transform information.
- the transform syntax corresponds to available MTS (Multiple Transform Selection) candidates, and the available MTS candidates are the same for an IBC coded block and the inter-coded block. In one embodiment, the transform syntax corresponds to a number of MTS candidates or a size constraint for MTS, and the number of MTS candidates or the size constraint for MTS is the same for an IBC coded block and the inter-coded block.
- the transform syntax corresponds to a pre-defined transform mode.
- the pre-defined transform mode corresponds to MTS (Multiple Transform Selection) , SBT (Subblock Transform) , secondary transform, JCCR (Joint coding of chrominance residuals) , NSPT (Non-separable Primary Transform) , or a combination thereof.
- Fig. 1A illustrates an exemplary adaptive Inter/Intra video coding system
- Fig. 1B illustrates a corresponding decoder for the encoder in Fig. 1A.
- Fig. 2 illustrates an example of search area used for Intra template matching.
- Fig. 3 illustrates an example of use of IntraTMP block vector for IBC block.
- Fig. 4 illustrates an example of CPR (Current Picture Referencing) compensation, where blocks are predicted by corresponding blocks in the same picture.
- CPR Current Picture Referencing
- Fig. 5 illustrates the reference region of IBC Mode, where each block represents 64x64 luma sample unit and the reference region depends on the location of the current coded CU.
- Figs. 6A-D illustrate examples of the definition of reference samples for PDPC applied over various prediction modes, where Fig. 6A corresponds to the diagonal top-right mode, Fig. 6B corresponds to the diagonal bottom-left mode, Fig. 6C corresponds to the adjacent diagonal top-right mode and Fig. 6D corresponds to the adjacent diagonal bottom-left mode.
- Fig. 7 illustrates an exemplary flowchart of the first alternative method for WIIP mode.
- Fig. 8 illustrates an exemplary flowchart of the second method for WIIP mode.
- Fig. 9 illustrates an example of templates used for the current block and corresponding reference blocks to measure matching costs associated with merge candidates.
- Fig. 10 illustrates an example of Low-Frequency Non-Separable Transform (LFNST) process.
- Fig. 11 shows some examples of TT split forbidden when either width or height of a luma coding block is larger than 64.
- Fig. 12 summarizes the affected block sizes, where the separable DCT-II plus LFNST transform combinations are replaced with NSPT for the block shapes in NSPT for intra coding.
- Fig. 13 illustrates an example of IBC blending mode.
- Fig. 14 illustrates the signalling of IBC related syntaxes according to VVC standard.
- Fig. 15 illustrates the signalling of IBC related syntaxes according to an embodiment of the present invention.
- Fig. 16 illustrates a flowchart of an exemplary video coding system that generates a final predictor of IBC-GPM mode by combining one IBC AMVP/Merge predictor and one inter predictor according to an embodiment of the present invention.
- Fig. 17 illustrates a flowchart of an exemplary video coding system that uses a joint IBC and Intra candidate list according to an embodiment of the present invention.
- Fig. 18 illustrates a flowchart of an exemplary video coding system that signals the transform syntax for an IBC coded block by treating IBC modes as inter sub-modes according to an embodiment of the present invention.
- PDPC only applies to the intra predictor, refining the intra-predictor boundary according to the intra-prediction direction.
- a new method of PDPC in screen content video coding is disclosed in the present application.
- PDPC or block boundary refinement mode can be applied to blocks coded with screen content coding modes, such as IBC mode or IntraTMP mode.
- derivation of the intra-prediction mode for the block can be from the BV in IBC mode, the angle of the motion shift derived from IntraTMP mode, a decoder-side intra-mode derivation method, or the intra-prediction modes of a neighbouring block.
- Decoder-side intra mode derivation can be, but not limited to, DIMD or TIMD (template-based intra-mode derivation) .
- the PDPC process can be applied according to the derived intra-prediction mode.
- the intra prediction mode used in the PDPC process is derived from the slope of the BV.
- the slope of the BV is mapped to the closest intra-prediction angle.
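One way to realize this mapping is sketched below. It is an illustrative simplification: the 65 angular modes (2..66) are treated as uniformly spaced over 180 degrees, whereas VVC actually spaces them non-uniformly via the intraPredAngle table, and the Planar fallback for a degenerate BV is an assumption:

```python
import math

def bv_slope_to_intra_mode(bv_x, bv_y):
    """Map the slope of a block vector to the closest angular mode in 2..66.
    The y axis points downward, as in image coordinates."""
    if bv_x == 0 and bv_y == 0:
        return 0                                        # degenerate BV: fall back to Planar (assumption)
    phi = math.degrees(math.atan2(bv_y, bv_x)) % 180.0  # slope of the BV line
    d = phi if phi >= 135.0 else phi + 180.0            # direction angle in [135, 315): mode 2 .. mode 66
    return 2 + round((d - 135.0) * 64.0 / 180.0)
```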
- the intra prediction mode used in the PDPC process is derived from the slope of the motion shift derived in template matching.
- the slope of the motion shift is mapped to the closest intra-prediction angle.
- the intra prediction mode used in PDPC process is derived from DIMD, TIMD or some decoder-side derived intra prediction modes.
- the derived intra-prediction mode is utilized in PDPC process.
- the intra prediction mode used in the PDPC process is derived from the intra-prediction modes of a neighbouring block. If there are no intra-predicted neighbouring blocks, a default mode can be used in the PDPC process, such as planar mode or DC mode.
- the derived intra-prediction mode can be the same mode or the neighbouring modes of intra-prediction modes, such as intra-prediction modes -1, intra-prediction modes +1, intra-prediction modes -2, intra-prediction modes +2 and so on.
- a CIIP-PDPC-like predictor refinement process is performed on the current predictor. Neighbouring reconstructed samples are used to refine the current IBC predictor or IntraTMP predictor, and the blending lines or the blending weightings can be similar to planar mode or DC mode in the PDPC process.
- neighbouring reconstructed samples around IBC predictor are used to perform PDPC, following the blending rules like PDPC in planar mode or PDPC in DC mode, regardless of BV direction.
- neighbouring reconstructed samples around the predictor derived from template matching are used to perform PDPC, following the blending rules like PDPC in planar mode or PDPC in DC mode, regardless of the motion shift direction.
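A minimal sketch of such Planar/DC-style blending applied to an IBC or IntraTMP predictor is given below; the weight formulas mirror the VVC Planar/DC PDPC weights, and applying them to a non-intra predictor is the extension described above:

```python
import math

def pdpc_blend(pred, left, top):
    """Blend a predictor with neighbouring reconstructed samples.
    pred: H x W list of lists; left[y] is the reconstructed sample at (-1, y);
    top[x] is the reconstructed sample at (x, -1)."""
    H, W = len(pred), len(pred[0])
    shift = (int(math.log2(W)) + int(math.log2(H)) - 2) >> 2
    out = [row[:] for row in pred]
    for y in range(H):
        for x in range(W):
            wL = 32 >> min(31, (x << 1) >> shift)   # left weight decays with x
            wT = 32 >> min(31, (y << 1) >> shift)   # top weight decays with y
            out[y][x] = (wL * left[y] + wT * top[x]
                         + (64 - wL - wT) * pred[y][x] + 32) >> 6
    return out
```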
- blending lines or blending weightings can be increased or decreased according to some conditions, such as difference between the current block and the predictor, gradients in the current block or gradients in the predictor, block width, block height, block area or prediction modes of neighbouring blocks.
- the blending weightings can contain neighbouring reconstruction and the predictor, or only contain the predictor.
- the blending lines can be zero line at the predictor or multiple lines at the predictor.
- blending weightings in PDPC are increased.
- the threshold can be signalled or pre-defined.
- the increased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
- blending weightings in PDPC are decreased.
- the threshold can be signalled or pre-defined.
- the decreased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
- blending lines in PDPC are increased.
- the threshold can be signalled or pre-defined.
- the increased blending lines can be zero line at the predictor or multiple lines at the predictor.
- blending lines in PDPC are decreased.
- the threshold can be signalled or pre-defined.
- the decreased blending lines can be zero line at the predictor or multiple lines at the predictor.
- blending weightings in PDPC are increased.
- the threshold can be signalled or pre-defined.
- the increased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
- blending weightings in PDPC are decreased.
- the threshold can be signalled or pre-defined.
- the decreased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
- blending lines in PDPC are increased.
- the threshold can be signalled or pre-defined.
- the increased blending lines can be zero lines at the predictor or multiple lines at the predictor.
- blending lines in PDPC are decreased.
- the threshold can be signalled or pre-defined.
- the decreased blending lines can be zero lines at the predictor or multiple lines at the predictor.
- PDPC with multiple-reference lines is applied.
- Multiple reference lines around the predictor are considered in PDPC.
- PDPC may utilize one or more of the multiple reference lines in blending process.
- the blending weighting in different reference lines can be the same or be different.
- the blending weighting can be in a descending order or ascending order from inner to outer lines.
- the blending lines can be the same as original PDPC in intra-prediction modes, or different blending lines from original PDPC in intra-prediction modes, or some fixed order according to multiple reference line order (for example: the most outer predictor line uses the most inner reference line, the second outer predictor line uses the second inner reference line and so on) .
- when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during the PDPC process and the blending weightings of each reference line can be different or the same.
- the blending weightings can be zero or non-zero.
- the blending lines at the predictor can be multiple lines or single line or zero line.
- when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during the PDPC process, and the blending weightings of each reference line can be in an ascending order or descending order.
- each blending line corresponds to the reference line order.
- the most outer predictor line uses the most inner reference line
- the second outer predictor line uses the second inner reference line and so on.
- each blending line corresponds to the reference line order.
- the most outer predictor line uses the most outer reference line
- the second outer predictor line uses the second outer reference line and so on.
- PDPC may be applied according to the reordered reconstructed samples, or applied directly without reordering the reconstructed samples.
- PDPC is applied to the top boundary and right boundary of predictor according to the reordered reconstructed samples.
- PDPC is applied to the top boundary and left boundary of the predictor according to the reordered reconstructed samples.
- PDPC is applied to the top boundary and left boundary of the predictor without considering reordered reconstructed samples.
- PDPC is applied to the top boundary and right boundary of the predictor without considering reordered reconstructed samples.
- PDPC is applied to the bottom boundary and left boundary of the predictor according to the reordered reconstructed samples.
- PDPC is applied to the top boundary and left boundary of the predictor according to the reordered reconstructed samples.
- PDPC is applied to the bottom boundary and left boundary of the predictor without considering reordered reconstructed samples.
- PDPC is applied to the top boundary and left boundary of the predictor without considering reordered reconstructed samples.
- gradient PDPC can also be applied.
- the application conditions can be the same as conditions in intra-prediction or different conditions from intra-prediction.
- a high-level flag may be signalled or inferred to indicate the application of gradient PDPC.
- the high-level flag may correspond to SPS flag, PPS flag, frame-level flag, CTU-level flag or block-level flag.
- gradient PDPC is applied but can be disabled according to a high-level flag.
- PDPC can also apply to chroma components.
- the application conditions can be the same as luma component or can be different conditions from luma component’s conditions.
- the blending weightings in chroma PDPC can be the same as luma PDPC or can be different weightings from luma PDPC.
- the blending lines can be the same as luma PDPC or can be different from luma PDPC.
- a high-level flag may be signalled or inferred to indicate the application of chroma PDPC.
- the high-level flag may correspond to SPS flag, PPS flag, frame-level flag, CTU-level flag or block-level flag.
- PDPC is applied to chroma components.
- the chroma intra-prediction mode may follow luma intra-prediction mode or have its own mode.
- the derivation of chroma intra-prediction mode can be decoder-side intra mode derivation, such as DIMD or TIMD.
- PDPC is applied to chroma components or not according to block size constraints.
- if the current block area is too small, the current block width is smaller than a threshold, or the current block height is smaller than a threshold, PDPC is disabled.
- PDPC is applied to chroma components or not according to block size constraints.
- if the current block area is too large, the current block width is larger than a threshold, or the current block height is larger than a threshold, PDPC is disabled.
- PDPC is applied to chroma components or not according to MRL conditions.
- PDPC is applied to chroma components.
- PDPC is applied to chroma components or not according to MRL conditions.
- PDPC is disabled for chroma components.
- PDPC is applied to chroma components.
- the blending weightings in chroma PDPC may be increased according to block size conditions or neighbouring block conditions, such as prediction modes.
- PDPC is applied to chroma components.
- the blending weightings in chroma PDPC may be decreased according to block size conditions or neighbouring block conditions, such as prediction modes.
- the decreased blending weightings can be zero or non-zero.
- PDPC is applied to chroma components.
- the blending lines in chroma PDPC may be increased according to block size conditions or neighbouring block conditions, such as prediction modes.
- PDPC is applied to chroma components.
- the blending lines in chroma PDPC may be decreased according to block size conditions or neighbouring block conditions, such as prediction modes.
- the decreased blending lines can be zero or non-zero.
- whether PDPC is applied to chroma components will depend on a high-level flag.
- the high-level flag may be overwritten according to some conditions, such as block size, neighbouring block conditions, current block prediction mode or statistical information.
- whether PDPC is applied to chroma components will depend on a high-level flag.
- the high-level flag may be inferred according to some conditions, such as block size, neighbouring block conditions, current block prediction mode or statistical information.
- a boundary smoothing filtering is applied to refine the boundary of the predictor.
- the boundary smoothing filter may be like deblocking filter or SAO filter (edge offset) or some other smoothing filters (e.g. mean filter and average filter) .
- smoothing filter is applied to refine the boundary of the predictor, such as mean filter, average filter or other commonly used smoothing filters.
- deblocking filter-like operation is applied to refine the boundary of the predictor.
- boundary strength is calculated, and filtering operation may be applied to the boundary of the predictor according to the boundary strength.
- SAO- like operation is applied to refine the boundary of the predictor.
- edge offset is calculated, and filtering operation may be applied to the boundary of the predictor according to the edge offset.
- PDPC is conditionally applied according to some conditions, such as differences between the predictor and neighbouring samples, e.g., the SAD or SSE between the predictor and neighbouring samples.
- the pre-defined difference threshold can be explicitly signalled in SPS/PPS/PH/SH or implicitly derived.
- the pre-defined difference threshold can be derived according to slice type, QP, and so on. If the condition meets, PDPC may be enabled or disabled.
- the differences between the whole predictor and neighbouring reconstructed samples are calculated. If the sum of differences is larger than the pre-defined difference threshold, disable PDPC.
- the differences between the whole predictor and neighbouring reconstructed samples are calculated. If the sum of differences is smaller than the pre-defined difference threshold, disable PDPC.
- the differences between the predictor and neighbouring reconstructed samples in the left and the top boundaries are calculated respectively. If the sum of differences is larger than one predefined threshold at one boundary, disable the blending process in PDPC from that boundary. Otherwise, the blending process from that boundary is applied to refine the predictor.
- the differences between the predictor and neighbouring reconstructed samples in the left and the top boundaries are calculated respectively. If the sum of differences is smaller than one predefined threshold at one boundary, disable the blending process in PDPC from that boundary. Otherwise, the blending process from that boundary is applied to refine the predictor.
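The per-boundary gating in the last two embodiments can be sketched as follows, using the "larger than" variant with SAD as the difference measure (names hypothetical):

```python
def pdpc_boundary_gates(pred, left, top, threshold):
    """Return (blend_from_left, blend_from_top).  Blending from a boundary is
    disabled when the summed absolute difference between the predictor's
    boundary samples and the neighbouring reconstructed samples exceeds the
    threshold; otherwise blending from that boundary refines the predictor."""
    sad_left = sum(abs(row[0] - l) for row, l in zip(pred, left))
    sad_top = sum(abs(p - t) for p, t in zip(pred[0], top))
    return sad_left <= threshold, sad_top <= threshold
```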
- combined IBC mode can generate the final predictor by blending two IBC merge predictors. For example, one predictor can be generated according to the BV of the left neighbour. The other predictor can be generated according to historical BV. After that, these two predictors are blended to generate IBC-CIIP final predictor.
- combined IBC mode can generate the final predictor by blending one IBC merge predictor and one intraTMP predictor.
- combined IBC mode can generate the final predictor by blending one IBC AMVP predictor and one intraTMP predictor.
- the blending weightings are designed based on the type of two predictors.
- the first predictor is derived from the BV of the neighbouring block, and the second predictor is derived based on a non-adjacent candidate.
- the first predictor will apply larger weighting than the second predictor.
- the blending weightings are designed based on the template matching cost. For example, the blending weight can be larger for the predictor with a lower template matching cost.
- the blending weightings are designed based on the BVs and/or the prediction modes of neighbouring CUs. For example, the blending weight of IBC predictor can be larger if the top and left CUs are coded by IBC mode. On the other hand, the blending weight of intra predictor can be larger if the top and left CUs are coded by intra mode.
- the blending weightings are designed based on the DIMD gradient amplitudes in the neighbouring region (e.g. 3 template lines) .
- the blending weight of intra predictor can be larger if the DIMD gradient amplitude of the selected intra prediction mode is largest.
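A minimal sketch of the template-matching-cost-based weighting above; the fixed-point weight budget and the inverse-cost mapping are assumptions, since the text only requires that the lower-cost predictor receive the larger weight:

```python
def tm_cost_weights(cost0, cost1, total=8):
    """Split a fixed-point blending weight budget between two
    predictors so that the one with the lower template matching cost
    gets the larger share."""
    if cost0 == cost1:
        return total // 2, total - total // 2
    w0 = round(total * cost1 / (cost0 + cost1))  # inverse-cost split
    return w0, total - w0
```

For example, with costs 1 and 3 the first predictor receives 6/8 of the weight and the second 2/8.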
- the blending weightings (denoted as w0 and w1) are derived by a regression model which minimizes the MSE between the blended template samples and the current template samples.
- a blended template sample consists of K0 (e.g., 2) spatial gradients, K1 (e.g., 2) location terms, a nonlinear term, and a bias term. That is:
- B is a midValue set to (1 << (bitDepth-1))
- a blended template sample consists of T (e.g., 5) spatial samples, a nonlinear term, and a bias term. That is,
- B is a midValue set to (1 << (bitDepth-1)) .
- Fig. 13 illustrates an example of IBC blending mode, where the weightings are derived using template samples 1310 and 1320 respectively.
- the two predictors are blended using weightings w0 and w1 to form the final predictor S_final.
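The MSE-minimizing regression above can be sketched as a plain two-tap least-squares solve over the template samples. This simplification drops the gradient, location, nonlinear and bias terms of the full model, so it is not the normative derivation, only an illustration of the normal-equation step:

```python
def derive_blend_weights(p0, p1, target):
    """Solve the 2x2 normal equations minimizing
    sum((w0*p0[i] + w1*p1[i] - target[i])**2) over template samples,
    where p0/p1 are the two predictors' template samples and target
    holds the current template samples."""
    a00 = sum(x * x for x in p0)
    a01 = sum(x * y for x, y in zip(p0, p1))
    a11 = sum(y * y for y in p1)
    b0 = sum(x * t for x, t in zip(p0, target))
    b1 = sum(y * t for y, t in zip(p1, target))
    det = a00 * a11 - a01 * a01
    if det == 0:                 # degenerate template: equal blend
        return 0.5, 0.5
    w0 = (b0 * a11 - b1 * a01) / det
    w1 = (a00 * b1 - a01 * b0) / det
    return w0, w1
```

When the current template is exactly the average of the two predictors' templates, the solve returns equal weights, as expected.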
- the final predictor of IBC-GPM mode can be generated by combining two different IBC merge predictors. For example, one predictor can be generated according to the BV of the left neighbour. The other predictor can be generated according to historical BV. After that, these two predictors are combined to generate the IBC-GPM final predictor.
- the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP predictor and one intra predictor.
- the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP predictor and one intraTMP predictor.
- TM based reordering can be performed to reorder IBC-GPM partition group, where the indexes of more promising candidates can be further reduced to improve the coding efficiency.
- the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP/Merge predictor and one inter predictor.
- the inter predictor can be regular merge, MMVD (i.e., Merge with Motion Vector Difference) , TM merge, BM merge, CIIP (i.e., Combined Inter Merge/Intra Prediction) , Affine merge/AMVP, SbtMVP or other inter modes.
- the blending width of the IBC-GPM can be determined by the RD cost and be signalled by bits explicitly. Or the blending width of the IBC-GPM can be determined by the TM/BM cost and derived implicitly at decoder side.
- the number of blending width candidates can depend on the CU size, the prediction modes of two GPM parts or the blending width of the neighbouring GPM CUs. For example, if both GPM parts are IBC or intraTMP mode, the blending width should be small.
- the blending widths of two IBC-GPM parts can be different and determined separately. For example, if an IBC-GPM CU is composed of one IBC predictor and one inter predictor, the blending width of IBC/IntraTMP part can be small, and the blending width of inter part can be larger and selected from N blending width candidates.
- a joint IBC and regular merge candidate list is constructed, which includes original regular merge candidates and IBC candidates.
- the IBC-GPM predictor is generated by two merge predictors from the joint IBC and regular merge list, or one merge predictor from the joint list and the other predictor from the intra prediction mode (IPM) candidate list or TM merge candidate list.
- the IBC-GPM mode can be viewed as an inter mode and share the same signalling syntax with inter GPM mode.
- the subblock motion vectors of the IBC-GPM CU will be determined according to the non-IBC part. For example, if an IBC-GPM CU is composed of one IBC predictor and one inter predictor, the subblock motion vectors of the entire CU are set the same as those of the inter predictor.
- IBC-LIC mode can be enabled with RR-IBC mode, where the templates of IBC-LIC shall be derived according to RR-IBC’s type. For example, if horizontal flip is enabled, the templates of IBC-LIC shall be the samples on the right side of referenced block and the samples on the top of referenced block. For example, if vertical flip is enabled, the templates of IBC-LIC shall be the samples on the left side of referenced block and the samples on the bottom of referenced block.
- non-adjacent spatial candidates can be used in IBC merge. The to-be-referenced blocks can be referenced only if they are within the valid reference region. In one embodiment, the positions of non-adjacent IBC candidates are designed relative to the boundary position of IBC’s valid reference region.
- the similarity check can be applied to NAMVP list derivation to increase the diversity of NAMVPs.
- the threshold of the similarity check of the NAMVP list can be different from that of the IBC merge list. For example, a strict threshold will be used during the NAMVP list generation: a to-be-inserted candidate can be inserted only if it is at least N pixels away from every other candidate in the list, where N is an integer larger than zero. After that, the candidates in the NAMVP list are inserted into the IBC merge candidate list with a looser threshold.
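The strict insertion rule above can be sketched as follows; the use of Chebyshev distance between BVs, and the function name, are assumptions made for illustration:

```python
def insert_with_similarity_check(cand_list, bv, min_dist):
    """Insert a BV candidate into the NAMVP list only if it is at
    least `min_dist` pixels away (Chebyshev distance) from every
    candidate already in the list.  Returns True when inserted."""
    for (bx, by) in cand_list:
        if max(abs(bv[0] - bx), abs(bv[1] - by)) < min_dist:
            return False          # too similar: rejected for diversity
    cand_list.append(bv)
    return True
```

Merging the resulting NAMVP candidates into the IBC merge list would reuse the same routine with a smaller `min_dist`.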
- a candidate is considered as redundant if the cost difference between the candidate and its predecessor is less than a lambda value.
- D1 and D2 are the costs obtained for a candidate and its predecessor during the first ARMC ordering, and λ is the Lagrangian parameter used in the RD criterion at the encoder side. The candidate is considered redundant if |D1 - D2| < λ.
- the lambda value can be different from the lambda value used for the inter merge candidate list.
- lambda values used for the IBC merge list are N times larger than lambda values used for the inter merge candidate list.
- lambda values used for the IBC merge list are the lambda values used for the inter merge candidate list plus an offset.
- lambda values used for the IBC merge list can be derived by a linear function of the lambda values used for the inter merge candidate list.
- the lambda values used for the IBC merge list can be designed according to the current CU size, bit depth or picture resolution.
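As a sketch of the redundancy test with an IBC-specific lambda, combining the scaling and offset variants above (the default values of the knobs are assumptions, not values from the text):

```python
def is_redundant(d_prev, d_cur, inter_lambda, scale_n=2, offset=0):
    """ARMC-style redundancy test for the IBC merge list: a candidate
    is redundant when the cost difference to its predecessor is below
    lambda.  The IBC lambda is derived from the inter-list lambda,
    here via an N-times scale and an additive offset."""
    ibc_lambda = inter_lambda * scale_n + offset
    return abs(d_cur - d_prev) < ibc_lambda
```

With `inter_lambda = 2` and `scale_n = 2`, a cost difference of 3 is pruned while a difference of 10 is kept.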
- IBC inter and IBC merge modes are adopted and the syntaxes of IBC and Inter modes are designed separately.
- an IBC flag is signalled before all the IBC modes and inter modes to indicate whether the current CU is IBC or inter mode.
- the sub-mode flags (e.g., geoFlag, mmvdFlag, ciipFlag, etc. ) and the motion information are then signalled.
- the current syntax design of IBC modes might not be optimal.
- IBC modes are signalled as Inter sub-modes as shown in Fig. 15. That is, the IBC flag is signalled after each Inter sub-mode flag. In one example, the IBC flag is signalled after the subblock merge flag to indicate whether the current CU is IBC or inter subblock merge mode.
- the IBC flag is signalled after the geoFlag to indicate whether the current CU is IBC or inter GPM mode.
- the signalling of geoFlag and also the following merge indices, partition index and blending index is shared between these two modes to reduce the redundancy in original IBC GPM syntax design.
- the syntax of IBC-GPM mode is independent of Inter GPM.
- the changes of the IBC syntax design can be applied to all the merge modes including GPM, MMVD (i.e., MBVD for IBC) , TM, CIIP and subblock merge modes and the AMVP modes including uni-prediction AMVP, bi-prediction AMVP, SMVD and affine AMVP.
- the transform syntax for an IBC coded block is signalled by treating IBC modes as inter sub-modes.
- a first coding bin of the MTS index indicates whether to use MTS types for horizontal or vertical transform direction.
- the first coding bin uses different contexts for intra and inter (including IBC) .
- a follow-up checking is to distinguish whether the current block is coded with mode intra type (only including intra prediction modes) , or mode inter type (including inter prediction modes and IBC prediction modes) .
- the number of available MTS candidates and/or size constraint to MTS are the same between inter and IBC prediction modes.
- the contexts used in the subsequent coding bins (following the first coding bin) are shared with inter and IBC prediction modes.
- the contexts used in the subsequent coding bins (following the first coding bin) are different between inter and IBC prediction modes. That is, inter and IBC have their own contexts, respectively.
- the number of available MTS candidates and/or size constraint to MTS are different between inter and IBC prediction modes.
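The context selection for the MTS index bins described above can be sketched as follows; the numeric context indices are purely illustrative, not the standard's actual context model indices:

```python
def mts_ctx_sets(pred_mode, ibc_shares_inter=True):
    """Return (first_bin_ctx, subsequent_bin_ctx) for MTS index
    coding.  The first bin distinguishes only intra type vs. inter
    type (inter and IBC grouped together); subsequent bins either
    share the inter contexts with IBC or give IBC its own set."""
    INTRA, INTER, IBC = 0, 1, 2
    if pred_mode == 'intra':
        return INTRA, INTRA
    first = INTER                      # inter and IBC share the first-bin context
    rest = INTER if (pred_mode == 'inter' or ibc_shares_inter) else IBC
    return first, rest
```

Setting `ibc_shares_inter=False` models the embodiment where inter and IBC each have their own contexts for the subsequent bins.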
- the above proposed methods can be used for the pre-defined transform mode.
- the pre-defined transform mode can be MTS, SBT, secondary transform, JCCR, NSPT, any transform mode in the standard, and/or any subset or extension of the mentioned transform modes.
- a joint IBC and Intra candidate list is proposed to remove the need for IBC flag signalling.
- the IBC candidates including the IBC spatial, non-adjacent and HMVP candidates are inserted into the Intra MPM candidate list.
- the candidate reordering by TM cost can be performed to constrain the size of the candidate list.
- according to the BV information and the candidate indices, one or more IBC candidates can be selected.
- the IBC-coded CU can also be referenced.
- the IBC modes can be viewed as Intra mode, and the signalling of IBC flag can be omitted.
- the RR-IBC flag is only signalled when the CU is coded by IBC AMVP mode.
- the IBC-LIC flag is only signalled when the CU is coded by IBC.
- the IBC-OBMC flag is only signalled when the CU is coded by IBC.
- a joint IBC and Affine candidate list is proposed to remove the need for IBC flag signalling.
- the IBC candidates including the IBC spatial, non-adjacent and HMVP candidates are inserted into the Affine candidate list.
- the candidate reordering according to TM cost can be performed to constrain the size of the candidate list.
- the current frame is included into the reference list and the reference pictures of the IBC candidates are set to the current frame.
- the BV information, candidate indices, CPMVs, reference indices and the inter prediction indicator are set and the BV, IBC-LIC flag, RR-IBC flag and other BV information are spanned to each 4x4 subblock in the CU.
- the IBC-coded CU can also be referenced.
- the IBC modes can be viewed as Affine mode and the signalling of IBC flag can be removed.
- the IBC candidates are added into Inter, Intra and Affine candidate list.
- the IBC candidates can exist in all three candidate lists without conflict.
- a joint IBC, Inter, Intra and Affine candidate list is proposed to remove the need for IBC flag signalling.
- in a joint IBC, Inter, Intra and Affine candidate list, all the IBC, Inter, Intra and Affine candidates are included.
- the candidate reordering by TM (Template Matching) cost can be performed to constrain the size of the candidate list. For example, the candidates in the list are re-ordered according to the TM cost and only a subset of the candidates with the lowest TM cost are kept in the list to reduce the total number of candidates.
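The TM-based reordering and truncation above amounts to a sort-and-keep step; this sketch assumes the TM costs have already been computed for each candidate:

```python
def reorder_and_trim(candidates, tm_costs, keep):
    """Reorder a joint candidate list by ascending template matching
    cost and keep only the `keep` best entries, constraining the
    total size of the candidate list."""
    order = sorted(range(len(candidates)), key=lambda i: tm_costs[i])
    return [candidates[i] for i in order[:keep]]
```

For instance, with costs [5, 1, 3] and `keep=2`, only the second and third candidates survive, in that order.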
- the current frame is included into the reference list and the reference pictures of the IBC candidates are set to the current frame.
- the corresponding MV information, BV information, CPMV information, candidate indices, reference indices and the inter prediction indicator are set and the MV, BV, IBC-LIC flag, RR-IBC flag and other motion information are spanned to each 4x4 subblock in the CU.
- the IBC-coded CU can also be referenced. In this case, all the coding modes can be viewed as a joint mode including IBC and the signalling of IBC flag can be omitted.
- the order of IBC mode syntax and Inter mode syntax is not aligned.
- the TM merge flag is signalled before MMVD merge flag for Inter mode in VVC.
- the TM merge flag is signalled after MBVD merge flag which is different from Inter mode and leads to difficulty in hardware design.
- the TM merge flag is signalled before MBVD merge flag and MBVD merge index.
- the IBC-LIC flag is signalled after all IBC mode flags including IBC merge modes and IBC AMVP modes similar to Inter-LIC syntax.
- the IBC-LIC flag and the Inter-LIC flag are signalled separately and the Inter-LIC flag and OBMC flag are mutually exclusive.
- the LIC and/or OBMC flag for IBC and Inter modes are shared to remove the signalling redundancy.
- the IBC-LIC flag and IBC OBMC flag are mutually exclusive. That is, when IBC-LIC is enabled, the OBMC mode is disabled for the current CU.
- the IBC-LIC flag and IBC OBMC flag are independent. That is, when IBC-LIC is enabled, the OBMC mode can be enabled or disabled.
- the foregoing proposed method can be implemented in encoders and/or decoders.
- the proposed method can be implemented in an inter prediction module and/or intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder.
- the foregoing proposed method related to IBC coding and signalling can be implemented in encoders and/or decoders.
- the proposed method can be implemented in an inter prediction module and/or intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder.
- Any of the proposed IBC coding and signalling designs can be implemented in an Intra/Inter coding module (e.g., Intra Pred. 150/MC 152 in Fig. 1B) in a decoder or in an Intra/Inter coding module in an encoder (e.g. Intra Pred. 110/Inter Pred. 112 in Fig. 1A) .
- any of the proposed IBC coding and signalling methods can also be implemented as a circuit coupled to the intra/inter coding module at the decoder or the encoder.
- the decoder or encoder may also use an additional processing unit to implement the required cross-component prediction processing.
- the Intra Pred. units e.g., unit 110/112 in Fig. 1A and unit 150/152 in Fig. 1B
- the Intra Pred. units are shown as individual processing units, they may correspond to executable software or firmware codes stored on a media, such as hard disk or flash memory, for a CPU (Central Processing Unit) or programmable devices (e.g. DSP (Digital Signal Processor) or FPGA (Field Programmable Gate Array) ) .
- Fig. 16 illustrates a flowchart of an exemplary video coding system that generates a final predictor of IBC-GPM mode by combining one IBC AMVP/Merge predictor and one inter predictor according to an embodiment of the present invention.
- the steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side.
- the steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart.
- input data associated with a current block are received in step 1610, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side.
- the current block is partitioned into a first sub-partition and a second sub-partition in step 1620.
- First prediction is derived for the first sub-partition in step 1630, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor.
- Second prediction is derived for the second sub-partition in step 1640, wherein the second prediction comprises an inter predictor.
- the first sub-partition and the second sub-partition of the current block are encoded or decoded using information comprising the first prediction and the second prediction respectively in step 1650.
- Fig. 17 illustrates a flowchart of an exemplary video coding system that uses a joint IBC and Intra candidate list according to an embodiment of the present invention.
- input data associated with a current block are received in step 1710, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side.
- a prediction candidate list is derived in step 1720, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates.
- the current block is encoded or decoded in an IBC mode or an intra prediction mode using information comprising the prediction candidate list in step 1730.
- Fig. 18 illustrates a flowchart of an exemplary video coding system that signals the transform syntax for an IBC coded block by treating IBC modes as inter sub-modes according to an embodiment of the present invention.
- input data associated with a current block are received in step 1810, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and the current block is coded in an IBC (Intra Block Copy) mode.
- a transform syntax is signalled or parsed in the same way as for an inter-coded block in step 1820, wherein the transform syntax is associated with transform information for the current block. Forward transform or inverse transform is applied to the current block using the transform information in step 1830.
- Embodiment of the present invention as described above may be implemented in various hardware, software codes, or a combination of both.
- an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein.
- An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein.
- the invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) .
- These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
- the software code or firmware code may be developed in different programming languages and different formats or styles.
- the software code may also be compiled for different target platforms.
- different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
Abstract
Method and apparatus of IBC coding and signalling for video coding. According to the method, the current block is partitioned into a first sub-partition and a second sub-partition. First prediction is derived for the first sub-partition, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor. Second prediction is derived for the second sub-partition, wherein the second prediction comprises an inter predictor. The first sub-partition and the second sub-partition of the current block are encoded or decoded using information comprising the first prediction and the second prediction respectively. According to another method, a prediction candidate list comprising one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates is derived. In yet another method, a transform syntax for an IBC coded block is signalled in the same way as for an inter-coded block.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
The present invention is a non-Provisional Application of and claims priority to U.S. Provisional Patent Application No. 63/480,110, filed on January 17, 2023, U.S. Provisional Patent Application No. 63/495,588, filed on April 12, 2023, and U.S. Provisional Patent Application No. 63/500,308, filed on May 5, 2023. The U.S. Provisional Patent Applications are hereby incorporated by reference in their entireties.
BACKGROUND AND RELATED ART
Versatile video coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) . The standard has been published as an ISO standard: ISO/IEC 23090-3: 2021, Information technology -Coded representation of immersive media -Part 3: Versatile video coding, published Feb. 2021. VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources including 3-dimensional (3D) video signals.
Fig. 1A illustrates an exemplary adaptive Inter/Intra video encoding system incorporating loop processing. For Intra Prediction 110, the prediction data is derived based on previously encoded video data in the current picture. For Inter Prediction 112, Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture (s) and motion data. Switch 114 selects Intra Prediction 110 or Inter-Prediction 112, and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues. The prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter
prediction, and other information such as parameters associated with loop filters applied to underlying image area. The side information associated with Intra Prediction 110, Inter prediction 112 and in-loop filter 130, are provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues. The residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames.
As shown in Fig. 1A, incoming video data undergoes a series of processing in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to a series of processing. Accordingly, in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in the Reference Picture Buffer 134 in order to improve video quality. For example, deblocking filter (DF) , Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used. The loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream. In Fig. 1A, Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in the reference picture buffer 134. The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H. 264 or VVC.
The decoder, as shown in Fig. 1B, can use similar or portion of the same functional blocks as the encoder except for Transform 118 and Quantization 120 since the decoder only needs Inverse Quantization 124 and Inverse Transform 126. Instead of Entropy Encoder 122, the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g., ILPF information, Intra prediction information and Inter prediction information) . The Intra prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140. Furthermore, for Inter prediction, the decoder only needs to perform motion compensation (MC 152) according to Inter prediction information received from the Entropy Decoder 140 without the need for motion estimation.
Intra Template Matching
Intra Template Matching Prediction (IntraTMP) is a special intra prediction mode that copies the best prediction block from the reconstructed part of the current frame, whose L-shaped template matches the current template. For a predefined search range, the encoder searches
for the most similar template matched with the current template in a reconstructed part of the current frame and uses the corresponding block as a prediction block. The encoder then signals the usage of this mode, and the same prediction operation is performed at the decoder side.
The prediction signal is generated by matching the L-shaped causal neighbour of the current block with another block in a predefined search area in Fig. 2 consisting of:
R1: current CTU
R2: top-left CTU
R3: above CTU
R4: left CTU
In Fig. 2, the current block 210 in R1 is matched with the corresponding block 212 in R2. The templates for the current block and the matched block are shown as darker-colour L-shaped areas. Area 222 corresponds to reconstructed region in the current picture 220. Sum of absolute differences (SAD) is used as a cost function. Within each region, the decoder searches for the template that has least SAD with respect to the current one and uses its corresponding block as a prediction block.
The dimensions of all regions (SearchRange_w, SearchRange_h) are set proportional to the block dimension (BlkW, BlkH) to have a fixed number of SAD comparisons per pixel. That is:
SearchRange_w = a * BlkW,
SearchRange_h = a * BlkH.
Where ‘a’ is a constant that controls the gain/complexity trade-off. In practice, ‘a’ is equal to 5.
To speed up the template matching process, the search range of all search regions is subsampled by a factor of 2 in each direction. This reduces the number of template matching searches by a factor of 4. After finding the best match, a refinement process is performed. The refinement is done via a second template matching search around the best match with a reduced range. The reduced range is defined as min (BlkW, BlkH) /2.
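The IntraTMP search parameters described above can be collected in one small helper; this is a sketch of the parameter derivation only, not of the SAD search itself:

```python
def intra_tmp_search_params(blk_w, blk_h, a=5):
    """Search-range and refinement parameters for IntraTMP: the
    search range scales with the block size (a = 5 in practice), the
    coarse pass is subsampled by 2 in each direction (4x fewer SAD
    tests), and the refinement window around the best coarse match is
    min(BlkW, BlkH) / 2."""
    search_w = a * blk_w
    search_h = a * blk_h
    coarse_step = 2
    refine_range = min(blk_w, blk_h) // 2
    return search_w, search_h, coarse_step, refine_range
```

For a 16x8 block this yields an 80x40 search range, a coarse step of 2, and a refinement range of 4.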
The Intra template matching tool is enabled for CUs with size less than or equal to 64 in width and height. This maximum CU size for Intra template matching is configurable.
The Intra template matching prediction mode is signalled at CU level through a dedicated flag when DIMD (Decoder-side Intra Mode Derivation) is not used for current CU.
IntraTMP Derived Block Vector Candidates for IBC
In this method, block vector (BV) derived from the intra template matching prediction (IntraTMP) is used for Intra Block Copy (IBC) , which will be described later. The stored IntraTMP BV of the neighbouring blocks along with IBC BV are used as spatial BV candidates in IBC candidate list construction.
IntraTMP block vector is stored in the IBC block vector buffer and, the current IBC block can use both IBC BV and IntraTMP BV of neighbouring blocks as BV candidates for IBC BV candidate list as shown in Fig. 3.
In Fig. 3, block 310 corresponds to the current block and block 312 corresponds to a neighbouring IntraTMP block. The IntraTMP BV 316 is used to locate the best matching block 322 according to the matching cost between template 324 and template 314. Area 322 corresponds to reconstructed region in the current picture 330. IntraTMP block vectors are added to IBC block vector candidate list as spatial candidates.
Current Picture Referencing
Motion Compensation, one of the key technologies in hybrid video coding, explores the pixel correlation between adjacent pictures. It is generally assumed that, in a video sequence, the patterns corresponding to objects or background in a frame are displaced to form corresponding objects in the subsequent frame or correlated with other patterns within the current frame. With the estimation of such displacement (e.g., using block matching techniques) , the pattern can be mostly reproduced without the need to re-code the pattern. Similarly, block matching and copy has also been tried to allow selecting the reference block from the same picture as the current block. It was observed to be inefficient when applying this concept to camera captured videos. Part of the reasons is that the texture pattern in a spatial neighbouring area may be similar to the current coding block, but usually with some gradual changes over the space. It is difficult for a block to find an exact match within the same picture in a video captured by a camera. Accordingly, the improvement in coding performance is limited.
However, the situation for spatial correlation among pixels within the same picture is different for screen contents. For a typical video with texts and graphics, there are usually repetitive patterns within the same picture. Hence, intra (picture) block compensation has been observed to be very effective. A new prediction mode, i.e., the intra block copy (IBC) mode or called current picture referencing (CPR) , has been introduced for screen content coding to utilize this characteristic. In the CPR mode, a prediction unit (PU) is predicted from a previously reconstructed block within the same picture. Further, a displacement vector (called block vector or BV) is used to indicate the relative displacement from the position of the current block to that of the reference block. The prediction errors are then coded using transformation, quantization and entropy coding. An example of CPR compensation is illustrated in Fig. 4, where block 412 is a corresponding block for block 410, and block 422 is a corresponding block for block 420. In this technique, the reference samples correspond to the reconstructed samples of the current decoded picture prior to in-loop filter operations, both deblocking and sample adaptive offset (SAO) filters in HEVC.
The very first version of CPR was proposed in JCTVC-M0350 (Budagavi et al., AHG8: Video coding using Intra motion compensation, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11, 13th Meeting: Incheon, KR, 18–26 Apr. 2013, Document: JCTVC-M0350) to the HEVC Range Extensions (RExt) development. In this version, the CPR compensation was limited to be within a small local area, with only 1-D block vector and only for block size of 2Nx2N. Later, a more advanced CPR design has been developed during the standardization of HEVC SCC (Screen Content Coding) .
IBC Reference Region
To reduce memory consumption and decoder complexity, the IBC in VVC allows only the reconstructed portion of the predefined area including the region of current CTU and some region of the left CTU. Fig. 5 illustrates the reference region of IBC Mode, where each block represents 64x64 luma sample unit. Depending on the location of the current coded CU within the current CTU, the following applies:
– If the current block falls into the top-left 64x64 block of the current CTU (case 510 in Fig. 5) , then in addition to the already reconstructed samples in the current CTU, it can also refer to the reference samples in the bottom-right 64x64 block of the left CTU, using the current picture referencing (CPR) mode. More details of CPR can be found in JVET-T2002 (Jianle Chen, et al., “Algorithm description for Versatile Video Coding and Test Model 11 (VTM 11) ” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 20th Meeting, by teleconference, 7–16 October 2020, Document: JVET-T2002) . The current block can also refer to the reference samples in the bottom-left 64x64 block of the left CTU and the reference samples in the top-right 64x64 block of the left CTU, using the CPR mode.
– If the current block falls into the top-right 64x64 block of the current CTU (case 520 in Fig. 5) , then in addition to the already reconstructed samples in the current CTU, if luma location (0, 64) relative to the current CTU has not yet been reconstructed, the current block can also refer to the reference samples in the bottom-left 64x64 block and bottom-right 64x64 block of the left CTU, using CPR mode; otherwise, the current block can also refer to reference samples in bottom-right 64x64 block of the left CTU.
– If the current block falls into the bottom-left 64x64 block of the current CTU (case 530 in Fig. 5) , then in addition to the already reconstructed samples in the current CTU, if luma location (64, 0) relative to the current CTU has not yet been reconstructed, the current block can also refer to the reference samples in the top-right 64x64 block and bottom-right 64x64 block of the left CTU, using CPR mode. Otherwise, the current block can also refer to the reference samples in the bottom-right 64x64 block of the left CTU, using CPR mode.
– If current block falls into the bottom-right 64x64 block of the current CTU (case 540 in Fig. 5) , it can only refer to the already reconstructed samples in the current CTU, using CPR mode.
This restriction allows the IBC mode to be implemented using local on-chip memory for hardware implementations.
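The four per-quadrant cases above can be sketched as follows. This is a minimal illustrative sketch, not part of the disclosure: it assumes a 128x128 CTU, indexes the 64x64 quadrants as (qx, qy) with qx, qy in {0, 1}, and takes a hypothetical `reconstructed(x, y)` callback that reports whether a luma location relative to the current CTU has been reconstructed.

```python
def allowed_left_ctu_quadrants(quad_x, quad_y, reconstructed):
    """Return the 64x64 quadrants (qx, qy) of the left CTU that a block in
    quadrant (quad_x, quad_y) of the current CTU may reference, in addition
    to the already reconstructed samples of the current CTU."""
    if (quad_x, quad_y) == (0, 0):          # top-left (case 510)
        return [(1, 1), (0, 1), (1, 0)]     # bottom-right, bottom-left, top-right
    if (quad_x, quad_y) == (1, 0):          # top-right (case 520)
        if not reconstructed(0, 64):        # luma (0, 64) not yet reconstructed
            return [(0, 1), (1, 1)]         # bottom-left and bottom-right
        return [(1, 1)]                     # bottom-right only
    if (quad_x, quad_y) == (0, 1):          # bottom-left (case 530)
        if not reconstructed(64, 0):        # luma (64, 0) not yet reconstructed
            return [(1, 0), (1, 1)]         # top-right and bottom-right
        return [(1, 1)]                     # bottom-right only
    return []                               # bottom-right (case 540): current CTU only
```

The callback makes the reconstruction-order dependency of cases 520 and 530 explicit without modeling the actual decoding schedule.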
IBC Interaction with Other Coding Tools
The interactions between the IBC mode and other inter coding tools in VVC, such as pairwise merge candidate, history based motion vector predictor (HMVP) , combined intra/inter prediction mode (CIIP) , merge mode with motion vector difference (MMVD) , and geometric partitioning mode (GPM) , are as follows:
- IBC can be used with pairwise merge candidate and HMVP. A new pairwise IBC merge candidate can be generated by averaging two IBC merge candidates. For HMVP, IBC motion is inserted into history buffer for future referencing.
- IBC cannot be used in combination with the following inter tools: affine motion, CIIP, MMVD, and GPM.
- IBC is not allowed for the chroma coding blocks when DUAL_TREE partition is used.
Unlike in the HEVC screen content coding extension, the current picture is no longer included as one of the reference pictures in the reference picture list 0 for IBC prediction. The derivation process of motion vectors for IBC mode excludes all neighbouring blocks in inter mode and vice versa. The following IBC design aspects are applied:
- IBC shares the same process as in regular MV merge including pairwise merge candidate and history-based motion predictor but disallows TMVP and zero vector because they are invalid for IBC mode.
- Separate HMVP buffer (5 candidates each) is used for conventional MV and IBC.
- Block vector constraints are implemented in the form of a bitstream conformance constraint: the encoder needs to ensure that no invalid vectors are present in the bitstream, and merge shall not be used if the merge candidate is invalid (out of range or 0) . Such bitstream conformance constraint is expressed in terms of a virtual buffer as described below.
- For deblocking, IBC is handled as inter mode.
- If the current block is coded using IBC prediction mode, AMVR does not use quarter-pel; instead, AMVR is signalled to only indicate whether the MV is integer-pel or 4 integer-pel.
- The number of IBC merge candidates can be signalled in the slice header separately from the numbers of regular, subblock, and geometric merge candidates.
A virtual buffer concept is used to describe the allowable reference region for IBC prediction mode and valid block vectors. Denote the CTU size as ctbSize; the virtual buffer, ibcBuf, has width wIbcBuf = 128x128/ctbSize and height hIbcBuf = ctbSize. For example, for a CTU size of 128x128, the size of ibcBuf is also 128x128; for a CTU size of 64x64, the size of ibcBuf is 256x64; and for a CTU size of 32x32, the size of ibcBuf is 512x32.
The size of a VPDU (Virtual Pipeline Data Unit) is min (ctbSize, 64) in each dimension, Wv = min (ctbSize, 64) .
The virtual IBC buffer, ibcBuf is maintained as follows.
- At the beginning of decoding each CTU row, refresh the whole ibcBuf with an invalid value -1.
- At the beginning of decoding a VPDU (xVPDU, yVPDU) relative to the top-left corner of the picture, set the ibcBuf [x] [y] = -1, with x = xVPDU%wIbcBuf, …, xVPDU%wIbcBuf + Wv -1; y = yVPDU%ctbSize, …, yVPDU%ctbSize + Wv -1.
- After decoding a CU containing (x, y) relative to the top-left corner of the picture, set ibcBuf [x %wIbcBuf] [y %ctbSize] = recSample [x] [y]
For a block covering the coordinates (x, y) , if the following is true for a block vector bv = (bv [0] , bv [1] ) , then it is valid; otherwise, it is not valid:
ibcBuf [ (x + bv [0] ) %wIbcBuf] [ (y + bv [1] ) %ctbSize] shall not be equal to -1.
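The buffer maintenance and block-vector validity rules above can be sketched as follows. This is a simplified illustration (luma-only, a single CTU row, and a per-sample store rather than a per-CU copy); the class and method names are not from the disclosure.

```python
class IbcVirtualBuffer:
    """Sketch of the virtual IBC buffer: wIbcBuf = 128*128/ctbSize wide,
    ctbSize tall, refreshed to the invalid value -1."""

    def __init__(self, ctb_size):
        self.ctb_size = ctb_size
        self.w = 128 * 128 // ctb_size                  # wIbcBuf
        self.h = ctb_size                               # hIbcBuf
        # Refresh at the beginning of a CTU row: all samples invalid.
        self.buf = [[-1] * self.w for _ in range(self.h)]

    def reset_vpdu(self, x_vpdu, y_vpdu):
        """Invalidate a VPDU area before decoding it."""
        wv = min(self.ctb_size, 64)
        for y in range(y_vpdu % self.ctb_size, y_vpdu % self.ctb_size + wv):
            for x in range(x_vpdu % self.w, x_vpdu % self.w + wv):
                self.buf[y][x] = -1

    def store(self, x, y, sample):
        """Record a reconstructed sample at picture position (x, y)."""
        self.buf[y % self.ctb_size][x % self.w] = sample

    def bv_valid(self, x, y, bv):
        """A block vector is valid only if the referenced sample is valid."""
        return self.buf[(y + bv[1]) % self.ctb_size][(x + bv[0]) % self.w] != -1
```

A full validity check would test every sample covered by the block; the single-sample form mirrors the per-coordinate condition stated above.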
Position Dependent Intra Prediction Combination (PDPC)
In VVC, the results of intra prediction of DC, planar and several angular modes are further modified by a position dependent intra prediction combination (PDPC) method. PDPC is an intra prediction method which invokes a combination of the boundary reference samples and HEVC-style intra prediction with filtered boundary reference samples. PDPC is applied to the following intra modes without signalling: planar, DC, intra angles less than or equal to horizontal, and intra angles greater than or equal to vertical and less than or equal to 80. If the current block is coded in BDPCM mode or the MRL index is larger than 0, PDPC is not applied.
The prediction sample pred (x′, y′) is predicted using an intra prediction mode (e.g., DC, planar, angular) and a linear combination of reference samples according to Equation 8 as follows:
pred (x′, y′) = Clip (0, (1<<BitDepth) -1, (wL×R-1, y′ + wT×Rx′, -1 + (64-wL-wT) ×pred (x′, y′) + 32) >> 6)
where Rx′, -1 and R-1, y′ represent the reference samples located at the top and left boundaries of the current sample (x′, y′) , respectively.
If PDPC is applied to DC, planar, horizontal, and vertical intra modes, additional boundary filters are not needed, such as the HEVC DC mode boundary filter or the horizontal/vertical mode edge filters. The PDPC process for DC and planar modes is identical. For angular modes, if the current angular mode is HOR_IDX or VER_IDX, the left or top reference samples are not used, respectively. The PDPC weights and scaling factors depend on the prediction mode and the block size. PDPC is applied to blocks with both width and height greater than or equal to 4.
Figs. 6A-D illustrate the definition of reference samples (Rx, -1 and R-1, y) for PDPC applied over various prediction modes, where Fig. 6A corresponds to the diagonal top-right mode, Fig. 6B corresponds to the diagonal bottom-left mode, Fig. 6C corresponds to the adjacent diagonal top-right mode and Fig. 6D corresponds to the adjacent diagonal bottom-left mode. The prediction sample pred (x′, y′) is located at (x′, y′) within the prediction block. As an example, the coordinate x of the reference sample Rx, -1 is given by: x=x′+y′+1, and the coordinate y of the reference sample R-1, y is similarly given by: y=x′+y′+1 for the diagonal modes. For the other angular modes, the reference samples Rx, -1 and R-1, y could be located at a fractional sample position. In this case, the sample value of the nearest integer sample location is used.
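The PDPC combination in Equation 8 can be sketched per sample as follows. This is an illustrative sketch only: it assumes the weights wL and wT and the left/top reference samples have already been derived for the position (x′, y′), and the function name is not from the disclosure.

```python
def pdpc_sample(pred, r_left, r_top, w_l, w_t, bit_depth=10):
    """Combine the intra predictor `pred` at (x', y') with the left
    reference R(-1, y') and top reference R(x', -1) per Equation 8."""
    val = (w_l * r_left + w_t * r_top + (64 - w_l - w_t) * pred + 32) >> 6
    # Clip to the valid sample range [0, 2^bitDepth - 1].
    return max(0, min((1 << bit_depth) - 1, val))
```

When pred and both references are equal, the weighted sum leaves the sample unchanged, which is a convenient sanity check on the weight normalization (weights sum to 64 before the >> 6 shift).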
Gradient PDPC
In VVC, for a few scenarios, PDPC may not be applied due to the unavailability of the secondary reference samples. In these cases, a gradient based PDPC, extended from the horizontal/vertical mode, is applied (as disclosed in JVET-Q0391 (Bappaditya Ray, et al., “Unified PDPC for angular intra modes” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 17th Meeting: Brussels, BE, 7–17 January 2020, Document: JVET-Q0391) ) . The PDPC weights (wT /wL) and the nScale parameter for determining the decay in PDPC weights with respect to the distance from the left/top boundary are set equal to the corresponding parameters in the horizontal/vertical mode, respectively. When the secondary reference sample is at a fractional sample position, bilinear interpolation is applied.
JVET-O0537 Non-CE: Weighted Intra and Inter Prediction Mode
A new mode, namely Weighted Intra and Inter Prediction (WIIP) mode, is signalled together with CIIP mode. When the CIIP flag is true, another flag, namely the WIIP flag, is further signalled to indicate whether to use WIIP mode or not. When the WIIP flag is true, WIIP mode is used. Otherwise (i.e., when the WIIP flag is false) , the original CIIP mode in VTM-5.0 is used. In addition, the same block size restrictions of CIIP mode also apply to WIIP mode. Two different alternatives are proposed as follows.
Method #1: The first alternative method is the same as that in JVET-N0395 (Liang Zhao, et al., “CE10-related: Simplification and improvement of combined intra-inter prediction mode” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 14th Meeting: Geneva, CH, 19–27 March 2019, Document: JVET-N0395) : the PDPC filtering process is directly applied to the inter prediction samples to generate the final prediction sample values. The flowchart of the first method is shown in Fig. 7, where the prediction samples from inter prediction unit 710, along with intra mode data and reconstructed data, are applied to PDPC filtering process 720.
Method #2: In the second alternative method, PDPC process does not depend on inter prediction data anymore. It is based on the neighbouring reconstructed samples, the same as CIIP. Then, the final prediction values of WIIP mode are calculated as a weighted average of PDPC prediction values and inter prediction values. To be specific, PDPC process in WIIP mode is shown below:
predPdpc (x, y) = (WL × R-1, y + WT × Rx, -1 + 32) >> 6,
wherein WL and WT represent the weighting factors, and R-1, y and Rx, -1 represent the left and above neighbouring reconstructed samples. WL and WT are the same as in the PDPC process. After that, the inter prediction values (predInter) and the PDPC prediction (predPdpc) are combined to generate the final prediction values, as follows:
FinalPred (x, y) = ( (predPdpc (x, y) << 6) + (64 - WL - WT) × predInter (x, y) + 32) >> 6,
FinalPred (x, y) = (WL × R-1, y + WT × Rx, -1 + (64 - WL - WT) × predInter (x, y) + 32) >> 6.
The flowchart of the second method is shown in Fig. 8, where the final predictor is generated based on PDPC prediction 820 and inter prediction 830 using a combiner 840. The input for PDPC prediction 820 is from intra prediction 810.
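Method #2 can be sketched per sample as follows. This is an illustrative sketch, not the disclosed implementation: the function name is assumed, and the two-step form (compute predPdpc, then blend with predInter) is used, with the equivalent single-step form noted in a comment.

```python
def wiip_sample(r_left, r_top, pred_inter, w_l, w_t):
    """WIIP Method #2: PDPC prediction from neighbouring reconstructed
    samples R(-1, y) and R(x, -1), blended with the inter prediction."""
    pred_pdpc = (w_l * r_left + w_t * r_top + 32) >> 6
    # Single-step equivalent (up to intermediate rounding):
    #   (WL*R(-1,y) + WT*R(x,-1) + (64-WL-WT)*predInter(x,y) + 32) >> 6
    return ((pred_pdpc << 6) + (64 - w_l - w_t) * pred_inter + 32) >> 6
```

Note that the two forms given above differ only by the intermediate rounding of predPdpc; the sketch follows the first (two-step) form.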
In addition, the same implicit transform scheme used in VTM-5.0 is applied for both proposed methods. To be specific, the implicit transform scheme used for WIIP mode is shown below:
trTypeHor = (nTbW >= 4 && nTbW <= 16) ? DST-7 : DCT-2,
trTypeVer = (nTbH >= 4 && nTbH <= 16) ? DST-7 : DCT-2.
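The implicit transform selection above depends only on the transform block width and height, and can be sketched as follows (the function name is illustrative, not from the disclosure):

```python
def wiip_transform_types(n_tb_w, n_tb_h):
    """Implicit transform scheme for WIIP mode: DST-7 for dimensions in
    [4, 16], DCT-2 otherwise, chosen independently per direction."""
    tr_type_hor = "DST-7" if 4 <= n_tb_w <= 16 else "DCT-2"
    tr_type_ver = "DST-7" if 4 <= n_tb_h <= 16 else "DCT-2"
    return tr_type_hor, tr_type_ver
```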
IBC-LIC
In JVET-AC0112 (Yang Wang, et al., “EE2-3.6: IBC-CIIP, IBC-GPM, and IBC-LIC” , Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 29th Meeting, by teleconference, 11–20 January 2023, Document: JVET-AC0112) , IBC-LIC is adopted in ECM (Enhanced Compression Model) . Intra block copy with local illumination compensation (IBC-LIC) is a coding tool which compensates the local illumination variation within a picture between the CU coded with IBC and its prediction block with a linear equation. The parameters of the linear equation are derived in the same way as LIC for inter prediction, except that the reference template is generated using the block vector in IBC-LIC. IBC-LIC can be applied to IBC AMVP mode and IBC merge mode. IBC AMVP (IBC with Advanced Motion Vector Prediction) mode corresponds to the IBC mode that directly signals a motion vector predictor index, and a motion vector difference if necessary. The name IBC AMVP is used in this disclosure. For IBC AMVP mode, an IBC-LIC flag is signalled to indicate the use of IBC-LIC. For IBC merge mode, the IBC-LIC flag is inferred from the merge candidate.
IBC-GPM
In JVET-AC0112, IBC-GPM is adopted in ECM. Intra block copy with geometry partitioning mode (IBC-GPM) is a coding tool which divides a CU into two sub-partitions geometrically. The prediction signals of the two sub-partitions are generated using IBC and intra prediction. IBC-GPM can be applied to regular IBC merge mode or IBC TM merge mode. An intra prediction mode (IPM) candidate list is constructed using the same method as GPM with inter and intra prediction for intra prediction, and the IPM candidate list size is pre-defined as 3. There are 48 geometry partitioning modes in total, which are divided into two geometry partitioning mode sets as follows:
Table 1: Geometry partitioning modes in the first geometry partitioning mode set
Table 2: Geometry partitioning modes in the second geometry partitioning mode set
When IBC-GPM is used, an IBC-GPM geometry partitioning mode set flag is signalled to indicate whether the first or the second geometry partitioning mode set is selected,
followed by the geometry partitioning mode index. An IBC-GPM intra flag is signalled to indicate whether intra prediction is used for the first sub-partition. When intra prediction is used for a sub-partition, an intra prediction mode index is signalled. When IBC is used for a sub-partition, a merge index is signalled.
IBC-CIIP
In JVET-AC0112, IBC-CIIP is adopted in ECM. Combined intra block copy and intra prediction (IBC-CIIP) is a coding tool in which a CU uses IBC with merge mode and intra prediction to obtain two prediction signals, and the two prediction signals are weighted and summed to generate the final prediction. Specifically, if the intra prediction is planar or DC mode, the final prediction is obtained as follows:
P = (wibc*Pibc + ( (1<<shift) -wibc) *Pintra + (1<< (shift-1) ) ) >> shift,
wherein Pibc and Pintra denote the IBC prediction signal and the intra prediction signal, respectively. (wibc, shift) are set equal to (13, 4) and (1, 1) for IBC merge mode and IBC AMVP mode, respectively.
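The IBC-CIIP weighted sum above can be sketched per sample as follows. The function name and the `ibc_merge` parameter are illustrative; (wibc, shift) follow the values stated above for the two modes.

```python
def ibc_ciip_sample(p_ibc, p_intra, ibc_merge=True):
    """IBC-CIIP blend: (wibc, shift) = (13, 4) for IBC merge mode,
    (1, 1) for IBC AMVP mode."""
    w_ibc, shift = (13, 4) if ibc_merge else (1, 1)
    return (w_ibc * p_ibc + ((1 << shift) - w_ibc) * p_intra
            + (1 << (shift - 1))) >> shift
```

For IBC merge mode this gives the IBC signal a 13/16 weight; for IBC AMVP mode the blend reduces to a rounded average of the two signals.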
Adaptive Reordering of Merge Candidates with Template Matching (ARMC-TM)
The merge candidates are adaptively reordered with template matching (TM) . The reordering method is applied to regular merge mode, TM merge mode, and affine merge mode (excluding the SbTMVP (Subblock-based Temporal Motion Vector Prediction) candidate) . For the TM merge mode, merge candidates are reordered before the refinement process.
An initial merge candidate list is first constructed according to a given checking order, such as spatial, TMVP, non-adjacent, HMVP, pairwise, and virtual merge candidates. Then the candidates in the initial list are divided into multiple subgroups. For the template matching (TM) merge mode and the adaptive DMVR mode, each merge candidate in the initial list is first refined by using TM/multi-pass DMVR. Merge candidates in each subgroup are reordered to generate a reordered merge candidate list, and the reordering is according to cost values based on template matching. The index of the selected merge candidate in the reordered merge candidate list is signalled to the decoder. For simplification, merge candidates in the last subgroup, unless it is also the first subgroup, are not reordered. All the zero candidates from the ARMC reordering process are excluded during the construction of the merge motion vector candidate list. The subgroup size is set to 5 for the regular merge mode and the TM merge mode. The subgroup size is set to 3 for the affine merge mode.
Cost calculation
The template matching cost of a merge candidate during the reordering process is measured by the SAD between samples of a template of the current block and their corresponding reference samples. The template comprises a set of reconstructed samples neighbouring to the
current block. Reference samples of the template are located by the motion information of the merge candidate. When a merge candidate utilizes bi-directional prediction, the reference samples of the template of the merge candidate are also generated by bi-prediction as shown in Fig. 9.
In Fig. 9, block 912 corresponds to a current block in current picture 910, blocks 922 and 932 correspond to reference blocks in reference pictures 920 and 930 in list 0 and list 1 respectively. Templates 914 and 916 are for current block 912, templates 924 and 926 are for reference block 922, and templates 934 and 936 are for reference block 932. Motion vectors 940, 942 and 944 are the list 0 motion vector of the merge candidate and motion vectors 950, 952 and 954 are the list 1 motion vector of the merge candidate.
Refinement of the initial merge candidate list
When multi-pass DMVR is used to derive the refined motion to the initial merge candidate list only the first pass (i.e., PU level) of multi-pass DMVR is applied in reordering. When template matching is used to derive the refined motion, the template size is set equal to 1. Only the above or left template is used during the motion refinement of TM when the block is flat with block width greater than 2 times of height or narrow with height greater than 2 times of width. TM is extended to perform 1/16-pel MVD precision. The first four merge candidates are reordered with the refined motion in TM merge mode.
Reordering criteria
In the reordering process, a candidate is considered as redundant if the cost difference between a candidate and its predecessor is inferior to a lambda value e.g., |D1-D2| < λ, where D1 and D2 are the costs obtained during the first ARMC ordering and λ is the Lagrangian parameter used in the RD criterion at encoder side.
The algorithm according to ARMC-TM is defined as the following:
- Determine the minimum cost difference between a candidate and its predecessor among all candidates in the list.
○ If the minimum cost difference is superior or equal to λ, the list is considered diverse enough and the reordering stops.
○ If this minimum cost difference is inferior to λ, the candidate is considered as redundant, and it is moved at a further position in the list. This further position is the first position where the candidate is diverse enough compared to its predecessor.
- The algorithm stops after a finite number of iterations (if the minimum cost difference is not inferior to λ) .
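The diversity-based reordering described above can be sketched as follows. This is an illustrative sketch under stated assumptions: `items` is a list of (candidate, cost) pairs already ordered by the first ARMC cost ordering, `lam` is λ, and the `max_iters` bound stands in for the "finite number of iterations" stopping rule.

```python
def armc_diversify(items, lam, max_iters=100):
    """Move a candidate whose cost is too close to its predecessor's
    (|D1 - D2| < lam) to the first later position where it is diverse
    enough; stop when the list is diverse or the iteration bound hits."""
    items = list(items)
    for _ in range(max_iters):
        diffs = [abs(items[i][1] - items[i - 1][1]) for i in range(1, len(items))]
        if not diffs or min(diffs) >= lam:
            break                               # list is diverse enough
        i = diffs.index(min(diffs)) + 1         # redundant candidate index
        cand = items.pop(i)
        j = i
        # Scan forward for the first position whose predecessor is diverse.
        while j < len(items) and abs(cand[1] - items[j - 1][1]) < lam:
            j += 1
        items.insert(j, cand)
    return items
```

A candidate that is close to every later predecessor ends up at the tail of the list, and the iteration bound prevents endless re-insertion, mirroring the finite-iteration stopping condition above.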
This algorithm is applied to the Regular, TM, BM and Affine merge modes. A similar algorithm is applied to the Merge MMVD and sign MVD prediction methods which also use ARMC for the reordering.
The value of λ is set equal to the λ of the rate-distortion criterion used to select the best merge candidate at the encoder side for the low-delay configuration, and to the value of λ corresponding to another QP for the Random Access configuration. A set of λ values corresponding to each signalled QP offset is provided in the SPS or in the Slice Header for the QP offsets which are not present in the SPS.
Extension to AMVP modes
The ARMC design is also applicable to the AMVP mode, wherein the AMVP candidates are reordered according to the TM cost. For the template matching for advanced motion vector prediction (TM-AMVP) mode, an initial AMVP candidate list is constructed, followed by a refinement from TM to construct a refined AMVP candidate list. In addition, an MVP candidate with a TM cost larger than a threshold, which is equal to five times the cost of the first MVP candidate, is skipped.
Note, when wrap around motion compensation is enabled, the MV candidate shall be clipped with wrap around offset taken into consideration.
Multiple Transform Selection (MTS) for Core Transform
In addition to DCT-II, which has been employed in HEVC, a Multiple Transform Selection (MTS) scheme is used for residual coding of both inter and intra coded blocks. It uses multiple selected transforms from the DCT-8/DST-7 family. The newly introduced transform matrices are DST-VII and DCT-VIII. Table 3 shows the basis functions of the selected DST/DCT.
Table 3. Transform basis functions of DCT-II/VIII and DSTVII for N-point input
In order to keep the orthogonality of the transform matrix, the transform matrices are quantized more accurately than the transform matrices in HEVC. To keep the intermediate values of the transformed coefficients within the 16-bit range, after the horizontal and the vertical transforms, all the coefficients are kept to 10-bit accuracy.
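Table 3 above lists the basis functions of the selected transforms. As an illustrative sketch (using the standard closed-form definitions of DCT-II and DST-VII for an N-point input, not text copied from the table), the orthonormal basis matrices can be generated as follows:

```python
import math

def dct2_basis(N):
    """Row i, column j of the N-point DCT-II basis:
    w0 * sqrt(2/N) * cos(pi * i * (2j+1) / (2N)), w0 = sqrt(1/2) for i=0."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2 / N) *
             math.cos(math.pi * i * (2 * j + 1) / (2 * N))
             for j in range(N)] for i in range(N)]

def dst7_basis(N):
    """Row i, column j of the N-point DST-VII basis:
    sqrt(4/(2N+1)) * sin(pi * (2i+1) * (j+1) / (2N+1))."""
    return [[math.sqrt(4 / (2 * N + 1)) *
             math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * N + 1))
             for j in range(N)] for i in range(N)]
```

The rows of each matrix are orthonormal, which is the property the integerized (10-bit accurate) cores in VVC approximate.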
In order to control the MTS scheme, separate enabling flags are specified at the SPS level for intra and inter, respectively. When MTS is enabled at the SPS, a CU-level flag is signalled to indicate whether MTS is applied or not. Here, MTS is applied only for luma. The MTS signalling is skipped when one of the conditions below applies:
- The position of the last significant coefficient for the luma TB is less than 1 (i.e., DC only) ,
- The last significant coefficient of the luma TB is located inside the MTS zero-out region.
If the MTS CU flag is equal to zero, then DCT-2 is applied in both directions. However, if the MTS CU flag is equal to one, then two other flags are additionally signalled to indicate the transform type for the horizontal and vertical directions, respectively. The transform and signalling mapping is shown in Table 4. A unified transform selection for ISP and implicit MTS is used by removing the intra-mode and block-shape dependencies. If the current block is in ISP mode, or if the current block is an intra block and both intra and inter explicit MTS are on, then only DST-7 is used for both the horizontal and vertical transform cores. As for transform matrix precision, 8-bit primary transform cores are used. Therefore, all the transform cores used in HEVC are kept the same, including 4-point DCT-2 and DST-7, and 8-point, 16-point and 32-point DCT-2. Also, the other transform cores, including 64-point DCT-2, 4-point DCT-8, and 8-point, 16-point and 32-point DST-7 and DCT-8, use 8-bit primary transform cores.
Table 4. Transform and signalling mapping table
To reduce the complexity of large size DST-7 and DCT-8, high frequency transform coefficients are zeroed out for the DST-7 and DCT-8 blocks with size (width or height, or both width and height) equal to 32. Only the coefficients within the 16x16 lower-frequency region are retained.
As in HEVC, the residual of a block can be coded with transform skip mode. To avoid redundancy in syntax coding, the transform skip flag is not signalled when the CU-level MTS_CU_flag is not equal to zero. Note that the implicit MTS transform is set to DCT-2 when LFNST or MIP is activated for the current CU. Also, implicit MTS can still be enabled when MTS is enabled for inter coded blocks.
Low-frequency non-separable transform (LFNST)
In VVC, LFNST is applied between forward primary transform and quantization (at the encoder side) and between de-quantization and inverse primary transform (at the decoder side) as shown in Fig. 10. As shown in Fig. 10, after Forward Primary Transform 1010, Forward Low-Frequency Non-Separable Transform LFNST 1020 is applied to top-left region 1022 of the Forward Primary Transform output. In LFNST, 4x4 non-separable transform or 8x8 non-separable transform is applied according to block size. For example, 4x4 LFNST is applied for small blocks (i.e., min (width, height) < 8) and 8x8 LFNST is applied for larger blocks (i.e., min (width, height) > 4) . After LFNST, the transform coefficients are quantized by Quantization 1030. To reconstruct the input signal, the quantized transform coefficients are de-quantized using De-Quantization 1040 to obtain the de-quantized transform coefficients. Inverse LFNST 1050 is applied to the top-left region (8 coefficients for 4x4 inverse LFNST or 16 coefficients for 8x8 inverse LFNST) . After inverse LFNST, Inverse Primary Transform 1060 is applied to recover the input signal.
Application of a non-separable transform, which is being used in LFNST, is described as follows using a 4x4 input as an example. To apply the 4x4 LFNST, the 4x4 input block X is first represented as a 16x1 vector X. The non-separable transform is calculated as F = T·X, where F indicates the 16x1 transform coefficient vector and T is a 16x16 transform matrix. The 16x1 coefficient vector F is subsequently re-organized as a 4x4 block using the scanning order for that block (horizontal, vertical or diagonal) . The coefficients with smaller index will be placed with the smaller scanning index in the 4x4 coefficient block.
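The vectorize-then-multiply step can be sketched as follows. This is an illustrative sketch only: the trained LFNST kernel matrices are not reproduced here, so T is a caller-supplied 16x16 matrix, and row-major vectorization is assumed.

```python
def nonseparable_4x4(block, T):
    """Apply a 4x4 non-separable transform: flatten the 4x4 block to a
    16-element vector x and compute F = T * x with a 16x16 matrix T."""
    x = [block[r][c] for r in range(4) for c in range(4)]   # row-major vectorization
    return [sum(T[i][j] * x[j] for j in range(16)) for i in range(16)]
```

The resulting 16 coefficients would then be re-organized into a 4x4 block following the scanning order, as described above.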
Reduced Non-Separable Transform
LFNST (low-frequency non-separable transform) is based on a direct matrix multiplication approach to apply the non-separable transform so that it is implemented in a single pass without multiple iterations. However, the non-separable transform matrix dimension needs to be reduced to minimize computational complexity and the memory space to store the transform coefficients. Hence, a reduced non-separable transform (or RST) method is used in LFNST. The main idea of the reduced non-separable transform is to map an N-dimensional vector (N is commonly equal to 64 for 8x8 NSST) to an R-dimensional vector in a different space, where N/R (R < N) is the reduction factor. Hence, instead of an NxN matrix, the RST matrix becomes an R×N matrix as follows:
T_RxN = [ t11 t12 t13 … t1N ; t21 t22 t23 … t2N ; … ; tR1 tR2 tR3 … tRN ]
where the R rows of the transform are R bases of the N dimensional space.
The inverse transform matrix for RST is the transpose of its forward transform. For 8x8 LFNST, a reduction factor of 4 is applied, and the 64x64 direct matrix, which is the conventional 8x8 non-separable transform matrix size, is reduced to a 16x48 direct matrix. Hence, the 48×16 inverse RST matrix is used at the decoder side to generate the core (primary) transform coefficients in the 8×8 top-left region. When 16x48 matrices are applied instead of 16x64 with the same transform set configuration, each of them takes 48 input data from three 4x4 blocks in a top-left 8x8 block, excluding the bottom-right 4x4 block. With the help of the reduced dimension, memory usage for storing all LFNST matrices is reduced from 10KB to 8KB with a reasonable performance drop. In order to reduce complexity, LFNST is restricted to be applicable only if all coefficients outside the first coefficient sub-group are non-significant. Hence, all primary-only transform coefficients have to be zero when LFNST is applied. This allows a conditioning of the LFNST index signalling on the last-significant position, and hence avoids the extra coefficient scanning in the current LFNST design, which is needed for checking for significant coefficients at specific positions only. The worst-case handling of LFNST (in terms of multiplications per pixel) restricts the non-separable transforms for 4x4 and 8x8 blocks to 8x16 and 8x48 transforms, respectively. In those cases, the last-significant scan position has to be less than 8 when LFNST is applied, and less than 16 for other sizes. For blocks with a shape of 4xN or Nx4 and N > 8, this restriction implies that the LFNST is now applied only once, and only to the top-left 4x4 region. As all primary-only coefficients are zero when LFNST is applied, the number of operations needed for the primary transforms is reduced in such cases. From the encoder perspective, the quantization of coefficients is remarkably simplified when LFNST transforms are tested.
A rate-distortion optimized quantization has to be done at maximum for the first 16 coefficients (in scan order) , the remaining coefficients are enforced to be zero.
LFNST Transform Selection
There are in total 4 transform sets, and 2 non-separable transform matrices (kernels) per transform set are used in LFNST. The mapping from the intra prediction mode to the transform set is pre-defined as shown in Table 5. If one of the three CCLM modes (INTRA_LT_CCLM, INTRA_T_CCLM or INTRA_L_CCLM) is used for the current block (81 <= predModeIntra <= 83) , transform set 0 is selected for the current chroma block. For each transform set, the selected non-separable secondary transform candidate is further specified by the explicitly signalled LFNST index. The index is signalled in a bitstream once per intra CU after the transform coefficients.
Table 5. Transform selection table
LFNST Index Signalling and Interaction with Other Tools
Since LFNST is restricted to be applicable only if all coefficients outside the first coefficient sub-group are non-significant, LFNST index coding depends on the position of the last significant coefficient. In addition, the LFNST index is context coded but does not depend on intra prediction mode, and only the first bin is context coded. Furthermore, LFNST is applied for intra CU in both intra and inter slices, and for both luma and chroma. If a dual tree is enabled, LFNST indices for luma and chroma are signalled separately. For inter slice (the dual tree is disabled) , a single LFNST index is signalled and used for both luma and chroma.
Considering that a large CU greater than 64x64 is implicitly split (TU tiling) due to the existing maximum transform size restriction (64x64) , an LFNST index search could increase data buffering by four times for a certain number of decoder pipeline stages. Therefore, the maximum size at which LFNST is allowed is restricted to 64x64. Note that LFNST is enabled with DCT-2 only. The LFNST index signalling is placed before the MTS index signalling.
Regarding the use of scaling matrices for perceptual quantization, it is not evident that the scaling matrices specified for the primary transforms are useful for LFNST coefficients. Hence, the use of scaling matrices for LFNST coefficients is not allowed. For single-tree partition mode, chroma LFNST is not applied.
Subblock Transform (SBT)
In the VTM (VVC Test Model) software, subblock transform is introduced for an inter-predicted CU. In this transform mode, only a sub-part of the residual block is coded for the CU. When an inter-predicted CU has cu_cbf equal to 1, cu_sbt_flag may be signalled to indicate whether the whole residual block or only a sub-part of the residual block is coded. In the former case, inter MTS information is further parsed to determine the transform type of the CU. In the latter case, a part of the residual block is coded with an inferred adaptive transform and the other part of the residual block is zeroed out.
When SBT is used for an inter-coded CU, the SBT type and SBT position information are signalled in the bitstream. There are two SBT types and two SBT positions. For SBT-V (or SBT-H) , the TU width (or height) may be equal to half of the CU width (or height) or to 1/4 of the CU width (or height) , resulting in a 2: 2 split or a 1: 3/3: 1 split. The 2: 2 split is like a binary tree (BT) split, while the 1: 3/3: 1 split is like an asymmetric binary tree (ABT) split. In ABT splitting, only the small region contains the non-zero residual. If one dimension of a CU is 8 in luma samples, the 1: 3/3: 1 split along that dimension is not allowed. There are at most 8 SBT modes for a CU.
Position-dependent transform core selection is applied on luma transform blocks in SBT-V and SBT-H (chroma TBs always use DCT-2) . The two positions of SBT-H (1130 and 1140) and SBT-V (1110 and 1120) are associated with different core transforms as shown in Fig. 11. More specifically, the horizontal and vertical transforms for each SBT position are specified in Fig. 11. For example, the horizontal and vertical transforms for SBT-V position 0 are DCT-8 and DST-7, respectively. When one side of the residual TU is greater than 32, the transform for both dimensions is set to DCT-2. Therefore, the subblock transform jointly specifies the TU tiling, cbf, and horizontal and vertical core transform type of a residual block.
The SBT is not applied to the CU coded with combined inter-intra mode.
Block Differential Pulse Coded Modulation (BDPCM)
VVC supports block differential pulse coded modulation (BDPCM) for screen content coding. At the sequence level, a BDPCM enable flag is signalled in the SPS; this flag is signalled only if the transform skip mode (described in the next section) is enabled in the SPS.
When BDPCM is enabled, a flag is transmitted at the CU level if the CU size is smaller than or equal to MaxTsSize by MaxTsSize in terms of luma samples and if the CU is intra coded, where MaxTsSize is the maximum block size for which the transform skip mode is allowed. This flag indicates whether regular intra coding or BDPCM is used. If BDPCM is used, a BDPCM prediction direction flag is transmitted to indicate whether the prediction is horizontal or vertical. Then, the block is predicted using the regular horizontal or vertical intra prediction process with unfiltered reference samples. The residual is quantized and the difference between each quantized residual and its predictor (i.e. the previously coded residual of the horizontal or vertical (depending on the BDPCM prediction direction) neighbouring position) is coded.
For a block of size M (height) × N (width) , let r_{i,j}, 0≤i≤M-1, 0≤j≤N-1, be the prediction residual. Let Q(r_{i,j}), 0≤i≤M-1, 0≤j≤N-1, denote the quantized version of the residual r_{i,j}. BDPCM is applied to the quantized residual values, resulting in a modified M × N array R̃ with elements r̃_{i,j}, where r̃_{i,j} is predicted from its neighbouring quantized residual value. For vertical BDPCM prediction mode, for 0≤j≤N-1, the following is used to derive r̃_{i,j}:

r̃_{i,j} = Q(r_{i,j}), if i = 0,
r̃_{i,j} = Q(r_{i,j}) - Q(r_{i-1,j}), if 1≤i≤M-1.

For horizontal BDPCM prediction mode, for 0≤i≤M-1, the following is used to derive r̃_{i,j}:

r̃_{i,j} = Q(r_{i,j}), if j = 0,
r̃_{i,j} = Q(r_{i,j}) - Q(r_{i,j-1}), if 1≤j≤N-1.

At the decoder side, the above process is reversed to compute Q(r_{i,j}), 0≤i≤M-1, 0≤j≤N-1, as follows:

Q(r_{i,j}) = Σ_{k=0..i} r̃_{k,j}, if vertical BDPCM is used,
Q(r_{i,j}) = Σ_{k=0..j} r̃_{i,k}, if horizontal BDPCM is used.

The inverse quantized residuals, Q^{-1}(Q(r_{i,j})), are added to the intra block prediction values to produce the reconstructed sample values.
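The forward differencing and its decoder-side reversal can be sketched as follows (illustrative Python; the function names and array layout are not part of any standard):

```python
import numpy as np

def bdpcm_forward(q, direction):
    """Difference each quantized residual from its top (vertical mode)
    or left (horizontal mode) neighbour."""
    r = q.astype(np.int64).copy()
    if direction == "vertical":
        r[1:, :] -= q[:-1, :]
    else:  # horizontal
        r[:, 1:] -= q[:, :-1]
    return r

def bdpcm_inverse(r, direction):
    """A cumulative sum along the prediction direction reverses the
    differencing at the decoder."""
    axis = 0 if direction == "vertical" else 1
    return np.cumsum(r, axis=axis)
```

A round trip through both functions recovers the original quantized residuals exactly.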
The predicted quantized residual values r̃_{i,j} are sent to the decoder using the same residual coding process as that in transform skip mode residual coding. For lossless coding, if slice_ts_residual_coding_disabled_flag is set to 1, the quantized residual values are sent to the decoder using regular transform residual coding. In terms of the MPM mode for future intra mode coding, horizontal or vertical prediction mode is stored for a BDPCM-coded CU if the BDPCM prediction direction is horizontal or vertical, respectively. For deblocking, if both blocks on the sides of a block boundary are coded using BDPCM, then that particular block boundary is not deblocked.
Joint Coding of Chroma Residuals (JCCR)
VVC (Adrian Browne, et al., “Algorithm description for Versatile Video Coding and Test Model 14 (VTM 14) ” , Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 23rd Meeting, by teleconference, 7–16 July 2021, Document: W2002) supports the joint coding of chroma residual (JCCR) tool (section 3.5.7 of JVET-W2002) where the chroma residuals for colour components are coded jointly. The usage (i.e., activation) of the JCCR mode is indicated by a TU-level flag tu_joint_cbcr_residual_flag and the selected
mode is implicitly indicated by the chroma CBFs (i.e., coded block flags) . The flag tu_joint_cbcr_residual_flag is present if either or both chroma CBFs for a TU are equal to 1. In the PPS (Picture Parameter Set) and slice header, chroma QP (quantization parameter) offset values are signalled for the JCCR mode to differentiate from the usual chroma QP offset values signalled for the regular chroma residual coding mode. These chroma QP offset values are used to derive the chroma QP values for some blocks coded by the JCCR mode. The JCCR mode has 3 sub-modes as shown in Table 6, where the reconstruction process of the chroma residuals (resCb and resCr) from the transmitted transform blocks is shown. When the JCCR mode is activated, one single joint chroma residual block (resJointC [x] [y] in Table 6) is signalled, and the residual block for Cb (resCb) and the residual block for Cr (resCr) are derived considering information such as tu_cbf_cb, tu_cbf_cr, and CSign, which is a sign value specified in the slice header.
Table 6. Three sub-modes of the JCCR mode
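A sketch of the sub-mode reconstruction is given below. The mapping from the chroma CBF pair to the three sub-modes follows the VVC design; the function name is illustrative, and the shifts are arithmetic as in the specification:

```python
def jccr_reconstruct(res_joint, tu_cbf_cb, tu_cbf_cr, c_sign):
    """Derive (resCb, resCr) from the joint chroma residual for the three
    JCCR sub-modes, selected implicitly by the chroma CBFs.
    c_sign is +1 or -1, as signalled in the slice header."""
    if tu_cbf_cb and not tu_cbf_cr:        # sub-mode 1
        res_cb = res_joint
        res_cr = (c_sign * res_joint) >> 1
    elif tu_cbf_cb and tu_cbf_cr:          # sub-mode 2
        res_cb = res_joint
        res_cr = c_sign * res_joint
    else:                                  # sub-mode 3 (only tu_cbf_cr set)
        res_cb = (c_sign * res_joint) >> 1
        res_cr = res_joint
    return res_cb, res_cr
```

For example, with CSign = -1 and both CBFs set, a joint residual of 4 yields resCb = 4 and resCr = -4.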
Non-separable Primary Transform (NSPT) for Intra Coding
The separable DCT-II plus LFNST transform combinations are replaced with NSPT for the block shapes 4x4, 4x8, 8x4, 8x8, 4x16, 16x4, 8x16 and 16x8.
The affected block sizes are summarized in Fig. 12.
All NSPTs consist of 35 sets with 3 candidates per set (similar to the current LFNST) . The kernels of the NSPTs have the following shapes:
· NSPT4x4: 16x16
· NSPT4x8/NSPT8x4: 32x20
· NSPT8x8: 64x32
· NSPT4x16/NSPT16x4: 64x24
· NSPT8x16/NSPT16x8: 128x40
Therefore, 12, 32, 40 and 88 coefficients are zeroed-out using NSPT4x8/NSPT8x4, NSPT8x8, NSPT4x16/NSPT16x4 and NSPT8x16/NSPT16x8 respectively.
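The zero-out counts follow directly from the kernel shapes, where each kernel maps all W*H input samples to a smaller number of retained coefficients:

```python
# Each NSPT kernel maps N = W*H input samples to K retained coefficients,
# so N - K coefficients of the block are zeroed out.
kernels = {
    "NSPT4x4": (16, 16), "NSPT4x8": (32, 20), "NSPT8x8": (64, 32),
    "NSPT4x16": (64, 24), "NSPT8x16": (128, 40),
}
zeroed = {name: n - k for name, (n, k) in kernels.items()}
# zeroed == {"NSPT4x4": 0, "NSPT4x8": 12, "NSPT8x8": 32,
#            "NSPT4x16": 40, "NSPT8x16": 88}
```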
In the present invention, methods are disclosed to improve the coding performance of video coding incorporating the IBC coding tool.
BRIEF SUMMARY OF THE INVENTION
A method and apparatus for video coding are disclosed. According to the method, input data associated with a current block are received, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side. The current block is partitioned into a first sub-partition and a second sub-partition. First prediction is derived for the first sub-partition, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor. Second prediction is derived for the second sub-partition, wherein the second prediction comprises an inter predictor. The first sub-partition and the second sub-partition of the current block are encoded or decoded using information comprising the first prediction and the second prediction respectively.
In one embodiment, the IBC-based predictor corresponds to an IBC AMVP (Advanced Motion Vector Prediction) predictor or an IBC Merge predictor. In one embodiment, the inter predictor corresponds to a regular merge predictor, MMVD (Merge with Motion Vector Difference) predictor, TM merge predictor, BM (Bilateral Matching) merge predictor, CIIP (Combined Inter Merge/Intra Prediction) predictor, Affine merge predictor, AMVP predictor, or SbTMVP (Subblock-based Temporal Motion Vector Prediction) predictor.
In one embodiment, a joint IBC and regular merge candidate list is constructed by including at least one IBC candidate and at least one regular merge candidate, and at least one target merge candidate is selected from the joint IBC and regular merge candidate list for one of the first sub-partition and the second sub-partition. In one embodiment, a second target candidate is selected from an intra prediction mode candidate list or a TM (Template Matching) merge candidate list for the other of the first sub-partition and the second sub-partition.
According to another method, input data associated with a current block are received, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side. A prediction candidate list is derived, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates. The current block is encoded or decoded in an IBC mode or an intra prediction mode using information comprising the prediction candidate list.
In one embodiment, when the current block is coded in the IBC mode, a target candidate is selected from the prediction candidate list without a need for signalling or parsing a flag indicating the IBC mode. In one embodiment, said one or more IBC-based prediction candidates comprise one or more IBC spatial candidates, one or more non-adjacent candidates, one or more HMVP (History-based Motion Vector Prediction) candidates, or a combination thereof. In one embodiment, member candidates in the prediction candidate list are reordered
according to TM (Template Matching) cost and a subset of the member candidates with lowest TM costs are kept in the prediction candidate list. In one embodiment, when the current block is coded in the intra prediction mode, the current block is allowed to reference one or more previously coded blocks in the IBC mode.
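The candidate reordering by template-matching cost described above can be sketched as follows (illustrative; in a real codec the cost function would compare the template of the current block against the template of each candidate's reference block):

```python
def reorder_by_tm_cost(candidates, tm_cost, keep):
    """Sort candidates by ascending template-matching cost and keep only
    the `keep` best ones in the pruned candidate list.

    tm_cost is a callable mapping a candidate to its TM cost; it stands
    in for the template-comparison step, which is not shown here."""
    return sorted(candidates, key=tm_cost)[:keep]
```

For example, with four candidates and hypothetical costs {a: 5, b: 1, c: 3, d: 2}, keeping the best two yields [b, d].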
According to yet another method, input data associated with a current block are received, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and the current block is coded in an IBC (Intra Block Copy) mode. A transform syntax is signalled or parsed in a same way as an inter-coded block, wherein the transform syntax is associated with transform information for the current block. Forward transform or inverse transform is applied to the current block using the transform information.
In one embodiment, the transform syntax corresponds to available MTS (Multiple Transform Selection) candidates and the available MTS candidates are the same for an IBC coded block and the inter-coded block. In one embodiment, the transform syntax corresponds to a number of MTS (Multiple Transform Selection) candidates or size constraint for MTS, and the number of MTS (Multiple Transform Selection) candidates or the size constraint for MTS is the same for an IBC coded block and the inter-coded block.
In one embodiment, the transform syntax corresponds to a pre-defined transform mode. In one embodiment, the pre-defined transform mode corresponds to MTS (Multiple Transform Selection) , SBT (Subblock Transform) , secondary transform, JCCR (Joint coding of chrominance residuals) , NSPT (Non-separable Primary Transform) , or a combination thereof.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1A illustrates an exemplary adaptive Inter/Intra video coding system incorporating loop processing.
Fig. 1B illustrates a corresponding decoder for the encoder in Fig. 1A.
Fig. 2 illustrates an example of search area used for Intra template matching.
Fig. 3 illustrates an example of use of IntraTMP block vector for IBC block.
Fig. 4 illustrates an example of CPR (Current Picture Referencing) compensation, where blocks are predicted by corresponding blocks in the same picture.
Fig. 5 illustrates the reference region of IBC Mode, where each block represents 64x64 luma sample unit and the reference region depends on the location of the current coded CU.
Figs. 6A-D illustrate examples of the definition of reference samples for PDPC applied over various prediction modes, where Fig. 6A corresponds to the diagonal top-right mode, Fig. 6B corresponds to the diagonal bottom-left mode, Fig. 6C corresponds to the adjacent diagonal
top-right mode and Fig. 6D corresponds to the adjacent diagonal bottom-left mode.
Fig. 7 illustrates an exemplary flowchart of the first alternative method for WIIP mode.
Fig. 8 illustrates an exemplary flowchart of the second method for WIIP mode.
Fig. 9 illustrates an example of templates used for the current block and corresponding reference blocks to measure matching costs associated with merge candidates.
Fig. 10 illustrates an example of Low-Frequency Non-Separable Transform (LFNST) process.
Fig. 11 illustrates the two positions of SBT-H and SBT-V, which are associated with different core transforms.
Fig. 12 summarizes the affected block sizes, where the separable DCT-II plus LFNST transform combinations are replaced with NSPT for the block shapes in NSPT for intra coding.
Fig. 13 illustrates an example of IBC blending mode.
Fig. 14 illustrates the signalling of IBC related syntaxes according to VVC standard.
Fig. 15 illustrates the signalling of IBC related syntaxes according to an embodiment of the present invention.
Fig. 16 illustrates a flowchart of an exemplary video coding system that generates a final predictor of IBC-GPM mode by combining one IBC AMVP/Merge predictor and one inter predictor according to an embodiment of the present invention.
Fig. 17 illustrates a flowchart of an exemplary video coding system that uses a joint IBC and Intra candidate list according to an embodiment of the present invention.
Fig. 18 illustrates a flowchart of an exemplary video coding system that signals the transform syntax for an IBC coded block by treating IBC modes as inter sub-modes according to an embodiment of the present invention.
It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the systems and methods of the present invention, as represented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. References throughout this specification to “one embodiment, ” “an embodiment, ” or similar language mean that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the
present invention. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, etc. In other instances, well-known structures, or operations are not shown or described in detail to avoid obscuring aspects of the invention. The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of apparatus and methods that are consistent with the invention as claimed herein.
Conventionally, PDPC will only apply to intra predictor, refining the intra-predictor boundary according to intra-prediction direction. A new method of PDPC in screen content video coding is disclosed in the present application. In the proposed method, PDPC or block boundary refinement mode can be applied to blocks coded with screen content coding modes, such as IBC mode or IntraTMP mode.
Intra Prediction Mode Derivation in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, the intra-prediction mode for the block can be derived from the BV in IBC mode, the angle of the motion shift derived in IntraTMP mode, a decoder-side intra-mode derivation method, or the intra-prediction modes of neighbouring blocks. Decoder-side intra mode derivation can be, but is not limited to, DIMD or TIMD (template-based intra-mode derivation) . After derivation of the intra-prediction mode for blocks coded with IBC mode or IntraTMP mode, the PDPC process can be applied according to the derived intra-prediction mode.
In one embodiment, for blocks coded with IBC mode, the intra prediction mode used in the PDPC process is derived from the slope of the BV. The slope of the BV is mapped to the closest intra-prediction angle.
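The slope-to-angle mapping can be sketched as follows. This is an illustrative sketch only: the angle table passed in is a hypothetical stand-in for the codec's actual angular-mode table (e.g. the intraPredAngle mapping), and angle wraparound is ignored for simplicity:

```python
import math

def bv_to_intra_mode(bv_x, bv_y, mode_angles):
    """Map the slope of a block vector to the angular intra mode whose
    direction is closest.

    mode_angles: {mode: angle_in_degrees}, an assumed table of the
    prediction direction of each angular mode."""
    bv_angle = math.degrees(math.atan2(bv_y, bv_x))
    return min(mode_angles, key=lambda m: abs(mode_angles[m] - bv_angle))
```

With a toy table {H: 0°, D: 45°, V: 90°}, a BV of (1, 1) maps to the diagonal mode D.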
In another embodiment, for blocks coded with IntraTMP mode, the intra prediction mode used in the PDPC process is derived from the slope of the motion shift derived in template matching. The slope of the motion shift is mapped to the closest intra-prediction angle.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, the intra prediction mode used in PDPC process is derived from DIMD, TIMD or some decoder-side derived intra prediction modes. The derived intra-prediction mode is utilized in PDPC process.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, the
intra prediction mode used in the PDPC process is derived from the intra-prediction modes of neighbouring blocks. If there are no intra-predicted neighbouring blocks, a default mode, such as planar mode or DC mode, can be used in the PDPC process.
In another embodiment, the derived intra-prediction mode can be the same mode or a neighbouring mode of the intra-prediction mode, such as the intra-prediction mode -1, +1, -2, +2, and so on.
CIIP-PDPC in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, CIIP-PDPC like predictor refinement process is performed at the current predictor. Neighbouring reconstructed samples are used to refine the current IBC predictor or IntraTMP predictor, and the blending lines or the blending weightings can be similar to planar mode or DC mode in PDPC process.
In one embodiment, when the current block is coded with IBC mode, neighbouring reconstructed samples around IBC predictor are used to perform PDPC, following the blending rules like PDPC in planar mode or PDPC in DC mode, regardless of BV direction.
In another embodiment, when the current block is coded with IntraTMP mode, neighbouring reconstructed samples around the predictor derived from template matching are used to perform PDPC, following the blending rules like PDPC in planar mode or PDPC in DC mode, regardless of motion shift direction.
Adaptive Weighting of PDPC in IBC mode and IntraTMP mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, blending lines or blending weightings can be increased or decreased according to some conditions, such as difference between the current block and the predictor, gradients in the current block or gradients in the predictor, block width, block height, block area or prediction modes of neighbouring blocks. The blending weightings can contain neighbouring reconstruction and the predictor, or only contain the predictor. The blending lines can be zero line at the predictor or multiple lines at the predictor.
In one embodiment, when the current block is coded with IBC mode or IntraTMP mode and when the difference between the current block and the predictor is larger or smaller than a threshold, blending weightings in PDPC are increased. The threshold can be signalled or pre-defined. The increased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode and when the difference between the current block and the predictor is larger or smaller than a threshold, blending weightings in PDPC are decreased. The threshold can be signalled or pre-defined. The decreased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode and when the difference between the current block and the predictor is larger or smaller than a threshold, blending lines in PDPC are increased. The threshold can be signalled or pre-defined. The increased blending lines can be zero line at the predictor or multiple lines at the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode and when the difference between the current block and the predictor is larger or smaller than a threshold, blending lines in PDPC are decreased. The threshold can be signalled or pre-defined. The decreased blending lines can be zero line at the predictor or multiple lines at the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode and when SAD or SSE or SATD between the current block and the predictor is larger or smaller than a threshold, blending weightings in PDPC are increased. The threshold can be signalled or pre-defined. The increased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, when SAD or SSE or SATD between the current block and the predictor is larger or smaller than a threshold, blending weightings in PDPC are decreased. The threshold can be signalled or pre-defined. The decreased weightings may contain neighbouring reconstruction and the predictor, or only contain the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, when SAD or SSE or SATD between the current block and the predictor is larger or smaller than a threshold, blending lines in PDPC are increased. The threshold can be signalled or pre-defined. The increased blending lines can be zero lines at the predictor or multiple lines at the predictor.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, when SAD or SSE or SATD between the current block and the predictor is larger or smaller than a threshold, blending lines in PDPC are decreased. The threshold can be signalled or pre-defined. The decreased blending lines can be zero lines at the predictor or multiple lines at the predictor.
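The SAD-gated adaptation of the blending weighting described in the embodiments above can be sketched as follows. This is a sketch under assumptions: the sign convention (whether a large difference increases or decreases the weighting) is a design choice the embodiments leave open, and the function name is illustrative:

```python
import numpy as np

def adapt_pdpc_weight(cur, pred, base_weight, threshold, step):
    """Increase the PDPC blending weight when the SAD between the block
    and its predictor exceeds the threshold, otherwise decrease it
    (clamped at zero). The direction of adjustment is an assumption."""
    sad = int(np.abs(cur.astype(np.int64) - pred.astype(np.int64)).sum())
    return base_weight + step if sad > threshold else max(0, base_weight - step)
```

The threshold would be signalled or pre-defined, as stated above.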
PDPC in IBC Mode and IntraTMP Mode with Multiple Reference Lines (MRL)
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, PDPC with multiple-reference lines is applied. Multiple reference lines around the predictor are considered in PDPC. PDPC may utilize one or more of the multiple reference lines in blending process. The blending weighting in different reference lines can be the same or be different. For example, the blending weighting can be in a descending order or ascending order from inner to outer lines. The blending lines can be the same as original PDPC in intra-prediction modes, or different blending lines from original PDPC in intra-prediction modes, or some fixed order according to multiple reference line order (for example: the most outer predictor line uses the most inner reference line, the second outer predictor line uses the second inner reference line and so on) .
In one embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process and the blending weightings of each reference line can be different or the same. The blending weightings can be zero or non-zero.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process and the blending lines at the predictor can be multiple lines or single line or zero line.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process, and the blending weightings of each reference line can be in an ascending order or descending order.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process, and the blending order of each blending line corresponds to the reference line order. For example, the most outer predictor line uses the most inner reference line, the second outer predictor line uses the second inner reference line and so on.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process, and the blending order of each blending line corresponds to the reference line order. For example, the most outer predictor line uses the most outer reference line, the second outer predictor line uses the second outer reference line and so on.
In another embodiment, when the current block is coded with IBC mode or IntraTMP mode, multiple reference lines are considered during PDPC process, and the blending combination of reference lines and blending lines are in a pre-defined order.
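The per-line blending weightings in ascending or descending order can be sketched as follows (the base weight and step size are illustrative assumptions, not values taken from any standard):

```python
def mrl_line_weights(num_lines, base=8, order="descending"):
    """Assign a blending weight to each reference line, ordered from the
    innermost line outward; weights may descend or ascend per the
    embodiment, and are floored at 1 here as an assumption."""
    w = [max(base - i, 1) for i in range(num_lines)]
    return w if order == "descending" else w[::-1]
```

For three reference lines this yields [8, 7, 6] in descending order and [6, 7, 8] in ascending order.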
PDPC in Reconstruction-Reordered Mode (RR-IBC Mode and RR-IntraTMP)
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, when reconstruction-reordered mode is enabled, PDPC may be applied according to the reordered reconstructed samples, or applied directly without reordering the reconstructed samples.
In one embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are horizontally flipped, PDPC is applied to the top boundary and right boundary of predictor according to the reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are horizontally flipped, PDPC is applied to the top boundary and left boundary of the predictor according to the reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are horizontally flipped, PDPC is applied to the top boundary and left boundary of the predictor without considering reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are horizontally flipped, PDPC is applied to the top boundary and right boundary of the predictor without considering reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are vertically flipped, PDPC is applied to the bottom boundary and left boundary of the predictor according to the reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are vertically flipped, PDPC is applied to the top boundary and left boundary of the predictor according to the reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are vertically flipped, PDPC is applied to the bottom boundary and left boundary of the predictor without considering reordered reconstructed samples.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the reconstructed samples are vertically flipped, PDPC is applied to the top boundary and left boundary of the predictor without considering reordered reconstructed samples.
Gradient PDPC in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, gradient PDPC can also be applied. The application conditions can be the same as conditions in intra-prediction or different conditions from intra-prediction. A high-level flag may be signalled or inferred to indicate the application of gradient PDPC. For example, the high-level flag may correspond to SPS flag, PPS flag, frame-level flag, CTU-level flag or block-level flag.
In one embodiment, for blocks coded with IBC mode or IntraTMP mode, when the
current block derives some intra-prediction angles, gradient PDPC is applied.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the current block derives some intra-prediction angles, gradient PDPC is applied but can be disabled according to a high-level flag.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, when the current block derives some intra-prediction angles, whether gradient PDPC is applied will depend on the high-level flag.
Chroma PDPC in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, PDPC can also be applied to chroma components. The application conditions can be the same as, or different from, those for the luma component. The blending weightings in chroma PDPC can be the same as, or different from, those in luma PDPC. The blending lines can be the same as, or different from, those in luma PDPC. A high-level flag may be signalled or inferred to indicate the application of chroma PDPC. For example, the high-level flag may correspond to an SPS flag, PPS flag, frame-level flag, CTU-level flag or block-level flag.
In one embodiment, for blocks coded with IBC mode or IntraTMP mode, PDPC is applied to chroma components. The chroma intra-prediction mode may follow the luma intra-prediction mode or have its own mode. Alternatively, the chroma intra-prediction mode can be derived by decoder-side intra mode derivation, such as DIMD or TIMD.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on block size constraints. When the current block area is too small, the current block width is smaller than a threshold, or the current block height is smaller than a threshold, PDPC is disabled.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on block size constraints. When the current block area is too large, the current block width is larger than a threshold, or the current block height is larger than a threshold, PDPC is disabled.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on MRL conditions. When a non-first reference line is used, PDPC is applied to chroma components.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on MRL conditions. When a non-first reference line is used, PDPC is disabled for chroma components.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, PDPC
is applied to chroma components. The blending weightings in chroma PDPC may be increased according to block size conditions or neighbouring block conditions, such as prediction modes.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, PDPC is applied to chroma components. The blending weightings in chroma PDPC may be decreased according to block size conditions or neighbouring block conditions, such as prediction modes. The decreased blending weightings can be zero or non-zero.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, PDPC is applied to chroma components. The blending lines in chroma PDPC may be increased according to block size conditions or neighbouring block conditions, such as prediction modes.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, PDPC is applied to chroma components. The blending lines in chroma PDPC may be decreased according to block size conditions or neighbouring block conditions, such as prediction modes. The decreased blending lines can be zero or non-zero.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on a high-level flag. The high-level flag may be overwritten according to some conditions, such as block size, neighbouring block conditions, current block prediction mode or statistical information.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, whether PDPC is applied to chroma components depends on a high-level flag. The high-level flag may be inferred according to some conditions, such as block size, neighbouring block conditions, current block prediction mode or statistical information.
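The block-size, MRL and high-level-flag conditions in the embodiments above can be sketched as a single decision function. The threshold values, the flag default, and the choice of disabling PDPC on non-first reference lines are illustrative assumptions, not normative values:

```python
# Sketch of the conditional chroma-PDPC decision (illustrative thresholds).

MAX_AREA = 1024        # assumed block-area threshold
MAX_SIDE = 32          # assumed width/height threshold

def chroma_pdpc_enabled(width, height, ref_line_idx, hl_flag=True):
    """Return True if PDPC is applied to chroma for an IBC/IntraTMP block."""
    if not hl_flag:                       # high-level flag gates the tool
        return False
    if width * height > MAX_AREA:         # current block area too large
        return False
    if width > MAX_SIDE or height > MAX_SIDE:
        return False
    if ref_line_idx != 0:                 # non-first MRL reference line
        return False                      # (the embodiment that disables PDPC)
    return True
```

In the embodiment where the high-level flag is overwritten or inferred, the `hl_flag` argument would itself be derived from block size, neighbouring block conditions or statistical information before this check.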
Boundary Smoothing in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, a boundary smoothing filtering is applied to refine the boundary of the predictor. The boundary smoothing filter may be similar to a deblocking filter, an SAO filter (edge offset) or some other smoothing filter (e.g. a mean filter or an average filter) .
In one embodiment, for blocks coded with IBC mode or IntraTMP mode, a smoothing filter is applied to refine the boundary of the predictor, such as a mean filter, an average filter or other commonly used smoothing filters.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, a deblocking filter-like operation is applied to refine the boundary of the predictor.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, boundary strength is calculated, and filtering operation may be applied to the boundary of the predictor according to the boundary strength.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, an SAO-like operation is applied to refine the boundary of the predictor.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, edge offset is calculated, and filtering operation may be applied to the boundary of the predictor according to the edge offset.
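A minimal sketch of the smoothing-filter variant above, assuming a simple integer 3-tap [1 2 1] kernel applied along the top row and left column of the predictor; the deblocking-like and SAO-like variants would replace this kernel with their respective boundary-strength or edge-offset operations:

```python
def smooth_predictor_boundary(pred, taps=(1, 2, 1)):
    """Refine the predictor boundary with a 3-tap smoothing filter applied
    to the top row (horizontally) and left column (vertically)."""
    h, w = len(pred), len(pred[0])
    out = [row[:] for row in pred]
    t = sum(taps)
    # top boundary row: horizontal smoothing over the original samples
    for x in range(1, w - 1):
        out[0][x] = (taps[0] * pred[0][x - 1] + taps[1] * pred[0][x]
                     + taps[2] * pred[0][x + 1] + t // 2) // t
    # left boundary column: vertical smoothing over the original samples
    for y in range(1, h - 1):
        out[y][0] = (taps[0] * pred[y - 1][0] + taps[1] * pred[y][0]
                     + taps[2] * pred[y + 1][0] + t // 2) // t
    return out
```

A flat predictor passes through unchanged, while a sharp sample on the boundary is averaged toward its neighbours.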
Conditional PDPC in IBC Mode and IntraTMP Mode
In the proposed method, for blocks coded with IBC mode or IntraTMP mode, PDPC is conditionally applied according to some conditions, such as the differences between the predictor and neighbouring samples (e.g. the SAD or SSE between the predictor and neighbouring samples) . The pre-defined difference threshold can be explicitly signalled in SPS/PPS/PH/SH or implicitly derived. For example, the pre-defined difference threshold can be derived according to slice type, QP, and so on. If the condition is met, PDPC may be enabled or disabled.
In one embodiment, for blocks coded with IBC mode or IntraTMP mode, the differences between the whole predictor and neighbouring reconstructed samples are calculated. If the sum of differences is larger than the pre-defined difference threshold, disable PDPC.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, the differences between the whole predictor and neighbouring reconstructed samples are calculated. If the sum of differences is smaller than the pre-defined difference threshold, disable PDPC.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, the differences between the predictor and neighbouring reconstructed samples in the left and the top boundaries are calculated respectively. If the sum of differences is larger than one predefined threshold at one boundary, disable the blending process in PDPC from that boundary. Otherwise, the blending process from that boundary is applied to refine the predictor.
In another embodiment, for blocks coded with IBC mode or IntraTMP mode, the differences between the predictor and neighbouring reconstructed samples in the left and the top boundaries are calculated respectively. If the sum of differences is smaller than one predefined threshold at one boundary, disable the blending process in PDPC from that boundary. Otherwise, the blending process from that boundary is applied to refine the predictor.
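The per-boundary decision in the embodiments above can be sketched as follows, assuming SAD as the difference measure and the convention that a boundary whose SAD exceeds the threshold has its PDPC blending disabled (the opposite convention swaps the comparison):

```python
def pdpc_boundary_decision(pred, rec_top, rec_left, threshold):
    """Decide per boundary whether the PDPC blending process is applied.

    pred     : 2-D predictor block (list of rows)
    rec_top  : reconstructed samples above the block (one per column)
    rec_left : reconstructed samples left of the block (one per row)
    Returns a dict mapping boundary name to True (blend) / False (disable).
    """
    sad_top = sum(abs(p - r) for p, r in zip(pred[0], rec_top))
    sad_left = sum(abs(row[0] - r) for row, r in zip(pred, rec_left))
    return {"top": sad_top <= threshold, "left": sad_left <= threshold}
```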
Increasing the Variation of IBC Blending Modes (IBC-CIIP)
In one embodiment, combined IBC mode can generate the final predictor by blending two IBC merge predictors. For example, one predictor can be generated according to the BV of the left neighbour. The other predictor can be generated according to historical BV. After that, these two predictors are blended to generate IBC-CIIP final predictor. In another embodiment, combined IBC mode can generate the final predictor by blending one IBC merge predictor and one intraTMP predictor. In one embodiment, combined IBC mode can generate the final predictor by blending one IBC AMVP predictor and one intraTMP predictor.
In another embodiment, the blending weightings are designed based on the type of two predictors. For example, the first predictor is derived from the BV of the neighbouring block, and the second predictor is derived based on a non-adjacent candidate. During blending, the first predictor will apply larger weighting than the second predictor. In another embodiment, the blending weightings are designed based on the template matching cost. For example, the blending weight can be larger for the predictor with a lower template matching cost.
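A sketch of the template-matching-cost-based weighting described above, assuming integer weights in 1/8 precision that are inversely related to the two costs; both the precision and the clipping to keep both predictors active are illustrative assumptions:

```python
def tm_cost_weights(cost0, cost1, total=8):
    """Derive integer blending weightings (w0, w1) with w0 + w1 == total.

    The predictor with the lower template matching cost receives the
    larger weight (inverse proportionality)."""
    if cost0 + cost1 == 0:
        return total // 2, total - total // 2      # equal split fallback
    w0 = round(total * cost1 / (cost0 + cost1))    # lower cost0 -> larger w0
    w0 = max(1, min(total - 1, w0))                # keep both predictors active
    return w0, total - w0
```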
In another embodiment, the blending weightings are designed based on the BVs and/or the prediction modes of neighbouring CUs. For example, the blending weight of IBC predictor can be larger if the top and left CUs are coded by IBC mode. On the other hand, the blending weight of intra predictor can be larger if the top and left CUs are coded by intra mode.
In another embodiment, the blending weightings are designed based on the DIMD gradient amplitudes in the neighbouring region (e.g. 3 template lines) . For example, the blending weight of intra predictor can be larger if the DIMD gradient amplitude of the selected intra prediction mode is largest.
In another embodiment, the blending weightings (denoted as ω0 and ω1) are derived by regression model which minimizes the MSE between the blended template samples and the current template samples.
In one example, a blended template sample consists of K0 (e.g., 2) spatial gradients, K1 (e.g., 2) location terms, a nonlinear term, and a bias term. That is:
blendedSample = c0*Gx + c1*Gy + c2*X + c3*Y + c4*P + c5*B
where Gx and Gy are the spatial gradients, X and Y are the vertical and horizontal relative locations, P is the nonlinear term, c0 to c5 are the model coefficients, and B is a midValue set to (1<< (bitDepth-1) ) .
In another example, a blended template sample consists of T (e.g., 5) spatial samples, a nonlinear term, and a bias term. That is,
blendedSample = c0*S0 + c1*S1 + ... + c(T-1)*S(T-1) + cP*P + cB*B
where S0 to S(T-1) are the T spatial samples, P is the nonlinear term, c0 to c(T-1), cP and cB are the model coefficients, and B is a midValue set to (1<< (bitDepth-1) ) .
Fig. 13 illustrates an example of the IBC blending mode, where the two predictors are derived using template samples 1310 and 1320 respectively. The two predictors are blended using weightings ω0 and ω1 to form the final predictor Sfinal.
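The regression-based derivation of ω0 and ω1 can be sketched with a closed-form 2x2 least-squares solve over the template samples. This sketch restricts the model to the two predictor terms; the examples above extend it with gradient, location, nonlinear and bias regressors, which would enlarge the normal-equation system accordingly:

```python
def regression_blend_weights(p0, p1, cur):
    """Solve  min_{w0,w1}  sum_i (w0*p0[i] + w1*p1[i] - cur[i])^2
    over the template samples via the 2x2 normal equations."""
    a = sum(x * x for x in p0)              # <p0, p0>
    b = sum(x * y for x, y in zip(p0, p1))  # <p0, p1>
    d = sum(y * y for y in p1)              # <p1, p1>
    e = sum(x * c for x, c in zip(p0, cur)) # <p0, cur>
    f = sum(y * c for y, c in zip(p1, cur)) # <p1, cur>
    det = a * d - b * b
    if det == 0:                # degenerate template: fall back to equal split
        return 0.5, 0.5
    w0 = (e * d - b * f) / det
    w1 = (a * f - b * e) / det
    return w0, w1
```

When the current template is an exact mixture of the two predictor templates, the solve recovers the mixing weights exactly.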
Increasing the Variation of IBC-GPM Modes
In another embodiment, the final predictor of IBC-GPM mode can be generated by combining two different IBC merge predictors. For example, one predictor can be generated according to the BV of left neighbour. The other predictor can be generated according to historical BV. After that, these two predictors are combined to generate IBC-GPM final predictor. In another embodiment, the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP predictor and one intra predictor. In another embodiment, the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP predictor and one intraTMP predictor.
In another embodiment, TM based reordering can be performed to reorder IBC-GPM partition group, where the indexes of more promising candidates can be further reduced to improve the coding efficiency.
In one embodiment, the final predictor of IBC-GPM mode can be generated by combining one IBC AMVP/Merge predictor and one inter predictor. The inter predictor can be regular merge, MMVD (i.e., Merge with Motion Vector Difference) , TM merge, BM merge, CIIP (i.e., Combined Inter Merge/Intra Prediction) , Affine merge/AMVP, SbtMVP or other inter modes.
In one embodiment, the blending width of the IBC-GPM can be determined by the RD cost and be signalled by bits explicitly. Or the blending width of the IBC-GPM can be determined by the TM/BM cost and derived implicitly at decoder side. The number of blending width candidates can depend on the CU size, the prediction modes of two GPM parts or the blending width of the neighbouring GPM CUs. For example, if both GPM parts are IBC or intraTMP mode, the blending width should be small.
In one embodiment, the blending widths of two IBC-GPM parts can be different and determined separately. For example, if an IBC-GPM CU is composed of one IBC predictor and one inter predictor, the blending width of IBC/IntraTMP part can be small, and the blending width of inter part can be larger and selected from N blending width candidates.
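A sketch of the implicit blending-width selection described above, with purely illustrative widths and conditions (in an actual codec the candidate set would be normative and the final choice would come from RD cost at the encoder or TM/BM cost at the decoder):

```python
def gpm_blend_width(part0_mode, part1_mode, cu_width, cu_height):
    """Pick an IBC-GPM blending width from the part modes and CU size.

    Illustrative rule: screen-content parts (IBC/IntraTMP) get a narrow
    blend; otherwise the width scales with the CU size."""
    screen_modes = ("IBC", "IntraTMP")
    if part0_mode in screen_modes and part1_mode in screen_modes:
        return 1                       # both parts screen content: small width
    return 2 if min(cu_width, cu_height) <= 16 else 4
```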
In one embodiment, a joint IBC and regular merge candidate list is constructed, which includes original regular merge candidates and IBC candidates. The IBC-GPM predictor is generated by two merge predictors from the joint IBC and regular merge list, or one merge predictor from the joint list and the other predictor from the intra prediction mode (IPM) candidate list or TM merge candidate list.
In one embodiment, the IBC-GPM mode can be viewed as an inter mode and share the same signalling syntax with inter GPM mode. The subblock motion vectors of the IBC-GPM CU will be determined according to the non-IBC part. For example, if an IBC-GPM CU is composed of one IBC predictor and one inter predictor, the subblock motion vectors of the entire CU are set the same as those of the inter predictor.
Improving IBC-LIC mode
In one embodiment, IBC-LIC mode can be enabled with RR-IBC mode, where the templates of IBC-LIC shall be derived according to the RR-IBC type. For example, if horizontal flip is enabled, the templates of IBC-LIC shall be the samples on the right side of the referenced block and the samples on the top of the referenced block. For example, if vertical flip is enabled, the templates of IBC-LIC shall be the samples on the left side of the referenced block and the samples on the bottom of the referenced block.
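The template-side rule above can be sketched directly; the string labels for sides and flip types are illustrative placeholders:

```python
def ibc_lic_template_sides(flip_type):
    """Template sides of the referenced block used by IBC-LIC under RR-IBC:
    the sides mirror with the flip direction."""
    if flip_type == "horizontal":
        return ("right", "top")        # left edge maps to right edge
    if flip_type == "vertical":
        return ("left", "bottom")      # top edge maps to bottom edge
    return ("left", "top")             # no flip: conventional template
```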
IBC with NAMVP
In one embodiment, non-adjacent spatial candidates can be used in IBC merge. The to-be-referenced blocks can be referenced only if they are in the valid reference region. In one embodiment, the positions of the non-adjacent candidates of IBC are designed related to the boundary position of the IBC valid reference region.
In one embodiment, the similarity check can be applied to the NAMVP list derivation to increase the diversity of NAMVPs. The threshold of the similarity check for the NAMVP list can be different from that of the IBC merge list. For example, a strict threshold will be used during the NAMVP list generation. For example, a to-be-inserted candidate can be inserted into the list only if it is at least N pixels away from every other candidate in the list, where N is an integer larger than zero. After that, the candidates in the NAMVP list are inserted into the IBC merge candidate list with a loose threshold.
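The strict similarity check can be sketched as follows; the use of Chebyshev distance between block vectors is an assumption for illustration (the text only requires an N-pixel separation):

```python
def insert_with_similarity_check(cand_list, bv, n_pixels):
    """Insert BV candidate (bvx, bvy) only if it is at least n_pixels away
    (Chebyshev distance, assumed) from every candidate already in the list.
    Returns True if inserted, False if pruned as too similar."""
    for ebv in cand_list:
        if max(abs(bv[0] - ebv[0]), abs(bv[1] - ebv[1])) < n_pixels:
            return False               # too similar: pruned
    cand_list.append(bv)
    return True
```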
IBC with Diversity Reordering
In the reordering process, a candidate is considered as redundant if the cost difference between a candidate and its predecessor is inferior to a lambda value. For example, D1 and D2 are the costs obtained for a candidate and its predecessor during the first ARMC ordering and λ is the Lagrangian parameter used in the RD criterion at encoder side. If |D1-D2| < λ, then the candidate is redundant.
In one embodiment, if the reordering process is applied to the IBC merge candidate list, the lambda value can be different from the lambda value used for the inter merge candidate list. For example, the lambda values used for the IBC merge list are N times larger than the lambda values used for the inter merge candidate list. For another example, the lambda values used for the IBC merge list are the lambda values used for the inter merge candidate list plus an offset. For another example, the lambda values used for the IBC merge list can be derived by a linear function of the lambda values used for the inter merge candidate list. In another embodiment, the lambda values used for the IBC merge list can be designed according to the current CU size, bit depth or picture resolution.
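A sketch of the redundancy rule |D1-D2| < λ and of a linear lambda mapping between the inter and IBC merge lists; the scale and offset values are illustrative assumptions:

```python
def prune_redundant(costs, lam):
    """Apply the diversity rule to an ARMC-ordered cost list: a candidate
    whose cost differs from its kept predecessor by less than lam is
    considered redundant and dropped."""
    kept = []
    for d in costs:
        if kept and abs(d - kept[-1]) < lam:
            continue                   # |D1 - D2| < lambda: redundant
        kept.append(d)
    return kept

def ibc_lambda(inter_lambda, scale=2, offset=0):
    """Illustrative linear mapping from the inter-list lambda to the
    IBC-list lambda (covers the N-times and plus-offset examples)."""
    return scale * inter_lambda + offset
```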
Signalling the IBC Modes as Inter Sub-modes
In VVC, only IBC inter and IBC merge modes are adopted and the syntaxes of IBC and Inter modes are designed separately. As shown in Fig. 14, an IBC flag is signalled before all the IBC modes and inter modes to indicate whether the current CU is IBC or inter mode. After
that, the sub-mode flags (e.g., geoFlag, mmvdFlag, ciipFlag, etc. ) and motion information are then signalled. However, as more and more IBC modes are proposed, the current syntax design of IBC modes might not be optimal.
In this invention, IBC modes are signalled as Inter sub-modes as shown in Fig. 15. That is, the IBC flag is signalled after each Inter sub-mode flag. In one example, the IBC flag is signalled after the subblock merge flag to indicate whether the current CU is IBC or inter subblock merge mode.
In another example, the IBC flag is signalled after the geoFlag to indicate whether the current CU is IBC or inter GPM mode. In this case, the signalling of geoFlag and also the following merge indices, partition index and blending index is shared between these two modes to reduce the redundancy in original IBC GPM syntax design. In the original IBC GPM syntax design the syntax of IBC-GPM mode is independent of Inter GPM.
In this invention, the changes of the IBC syntax design can be applied to all the merge modes including GPM, MMVD (i.e., MBVD for IBC) , TM, CIIP and subblock merge modes and the AMVP modes including uni-prediction AMVP, bi-prediction AMVP, SMVD and affine AMVP.
In another embodiment, the transform syntax for an IBC coded block is signalled by treating IBC modes as inter sub-modes. As an example of signalling the MTS index, a first coding bin of the MTS index indicates whether to use MTS types for the horizontal or vertical transform direction. In one sub-embodiment, the first coding bin uses different contexts for intra and inter (including IBC) . In response to the first coding bin indicating to use MTS types, a follow-up check distinguishes whether the current block is coded with the intra mode type (only including intra prediction modes) or the inter mode type (including inter prediction modes and IBC prediction modes) . In one sub-embodiment, the available MTS candidates for inter coded blocks and IBC coded blocks are unified. In another sub-embodiment, the number of available MTS candidates and/or the size constraint for MTS are the same between inter and IBC prediction modes. In another sub-embodiment, the contexts used in the subsequent coding bins (following the first coding bin) are shared between inter and IBC prediction modes. In another sub-embodiment, the contexts used in the subsequent coding bins (following the first coding bin) are different between inter and IBC prediction modes. That is, inter and IBC have their own contexts, respectively. In another sub-embodiment, the number of available MTS candidates and/or the size constraint for MTS are different between inter and IBC prediction modes. When a pre-defined transform mode is supported in IBC, the above proposed methods can be used for the pre-defined transform mode. For example, the pre-defined transform mode can be MTS, SBT, secondary transform, JCCR, NSPT, any transform mode in the standard, and/or any subset or extension of the mentioned transform modes.
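The context split and candidate-set sub-embodiments can be sketched as follows; the context ids and the transform-type sets are illustrative placeholders, not the normative tables of any standard:

```python
def mts_first_bin_ctx(pred_mode):
    """Context id for the first MTS-index bin: intra uses its own context,
    while inter and IBC share one (ids are illustrative)."""
    return 0 if pred_mode == "INTRA" else 1   # "INTER" and "IBC" share ctx 1

def mts_candidates(pred_mode, unified_inter_ibc=True):
    """Illustrative MTS candidate sets. One sub-embodiment unifies the
    inter and IBC sets; another gives IBC a different (smaller) set."""
    intra_set = ["DCT2", "DST7xDST7", "DCT8xDCT8", "DST7xDCT8", "DCT8xDST7"]
    inter_set = ["DCT2", "DST7xDST7", "DCT8xDCT8"]
    if pred_mode == "INTRA":
        return intra_set
    if pred_mode == "IBC" and not unified_inter_ibc:
        return inter_set[:2]                  # a smaller IBC-specific set
    return inter_set                          # unified inter/IBC set
```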
No IBC Flag Signalling by Merging IBC Modes to Intra Mode
In this invention, a joint IBC and Intra candidate list is proposed to remove the need for IBC flag signalling. In the joint IBC and Intra candidate list, the IBC candidates, including the IBC spatial, non-adjacent and HMVP candidates, are inserted into the Intra MPM candidate list. Candidate reordering by TM cost can be performed to constrain the size of the candidate list. When a CU is coded by IBC mode, the BV information and the candidate indices (i.e., one or more IBC candidates can be selected) are set, and the BV, IBC-LIC flag, RR-IBC flag and other BV information are spanned to each 4x4 subblock in the CU. For other Intra CUs, the IBC-coded CU can also be referenced. In this case, the IBC modes can be viewed as Intra mode, and the signalling of the IBC flag can be omitted.
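The joint-list construction can be sketched as concatenation, TM-cost reordering and truncation; `tm_cost` stands in for the actual template-matching cost computation and is supplied by the caller:

```python
def build_joint_list(ibc_cands, intra_mpm, tm_cost, max_size):
    """Build the joint IBC + Intra candidate list.

    ibc_cands : IBC candidates (spatial, non-adjacent, HMVP)
    intra_mpm : intra MPM candidates
    tm_cost   : callable candidate -> template matching cost
    max_size  : constrained list size after reordering
    """
    joint = list(ibc_cands) + list(intra_mpm)
    joint.sort(key=tm_cost)            # reorder by TM cost, lowest first
    return joint[:max_size]            # truncate to constrain the list size
```

With the list built this way, the selected candidate index alone identifies whether the CU is IBC-coded, so no separate IBC flag is needed.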
In one embodiment, the RR-IBC flag is only signalled when the CU is coded by IBC AMVP mode.
In another embodiment, the IBC-LIC flag is only signalled when the CU is coded by IBC.
In another embodiment, the IBC-OBMC flag is only signalled when the CU is coded by IBC.
No IBC Flag Signalling by Merging IBC Modes with Affine Mode
In this invention, a joint IBC and Affine candidate list is proposed to remove the need for IBC flag signalling. In a joint IBC and Affine candidate list, the IBC candidates including the IBC spatial, non-adjacent and HMVP candidates are inserted into the Affine candidate list. The candidate reordering according to TM cost can be performed to constrain the size of the candidate list. The current frame is included into the reference list and the reference pictures of the IBC candidates are set to the current frame. When a CU is coded by IBC mode, the BV information, candidate indices, CPMVs, reference indices and the inter prediction indicator are set and the BV, IBC-LIC flag, RR-IBC flag and other BV information are spanned to each 4x4 subblock in the CU. For other Affine CUs, the IBC-coded CU can also be referenced. In this case, the IBC modes can be viewed as Affine mode and the signalling of the IBC flag can be removed.
In one embodiment, the IBC candidates are added into the Inter, Intra and Affine candidate lists. The IBC candidates can exist in all three candidate lists without conflict.
No IBC Flag Signalling by Merging IBC Modes to Joint Mode
In this invention, a joint IBC, Inter, Intra and Affine candidate list is proposed to remove the need for IBC flag signalling. In a joint IBC, Inter, Intra and Affine candidate list, all the IBC, Inter, Intra and Affine candidates are included. The candidate reordering by TM (Template Matching) cost can be performed to constrain the size of the candidate list. For example,
the candidates in the list are re-ordered according to the TM cost and only a subset of the candidates with the lowest TM cost is kept in the list to reduce the total number of candidates. The current frame is included into the reference list and the reference pictures of the IBC candidates are set to the current frame. When a candidate is selected from the joint candidate list, the corresponding MV information, BV information, CPMV information, candidate indices, reference indices and the inter prediction indicator are set and the MV, BV, IBC-LIC flag, RR-IBC flag and other motion information are spanned to each 4x4 subblock in the CU. When constructing the joint candidate list for other CUs, the IBC-coded CU can also be referenced. In this case, all the coding modes can be viewed as a joint mode including IBC and the signalling of the IBC flag can be omitted.
Aligning Syntax Design of IBC Mode to Inter Mode
In current syntax design, the order of IBC mode syntax and Inter mode syntax is not aligned. For example, the TM merge flag is signalled before MMVD merge flag for Inter mode in VVC. However, for IBC mode, the TM merge flag is signalled after MBVD merge flag which is different from Inter mode and leads to difficulty in hardware design.
In one embodiment, the TM merge flag is signalled before MBVD merge flag and MBVD merge index.
In another embodiment, the IBC-LIC flag is signalled after all IBC mode flags including IBC merge modes and IBC AMVP modes similar to Inter-LIC syntax.
Syntax Design for IBC LIC and OBMC Flags
In the current design, the IBC-LIC flag and the Inter-LIC flag are signalled separately, and the Inter-LIC flag and the OBMC flag are mutually exclusive. To align the syntax design between IBC and Inter modes, in this invention, the LIC and/or OBMC flags for IBC and Inter modes are shared to remove the signalling redundancy.
In one embodiment, the IBC-LIC flag and the IBC OBMC flag are mutually exclusive. That is, when IBC-LIC is enabled, the OBMC mode is disabled for the current CU.
In another embodiment, the IBC-LIC flag and the IBC OBMC flag are independent. That is, when IBC-LIC is enabled, the OBMC mode can be enabled or disabled.
The foregoing proposed method can be implemented in encoders and/or decoders. For example, the proposed method can be implemented in an inter prediction module and/or intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder.
The foregoing proposed method related to IBC coding and signalling can be implemented in encoders and/or decoders. For example, the proposed method can be implemented in an inter prediction module and/or intra block copy prediction module of an encoder, and/or an inter prediction module (and/or intra block copy prediction module) of a decoder. Any of the
proposed IBC coding and signalling designs can be implemented in an Intra/Inter coding module (e.g., Intra Pred. 150/MC 152 in Fig. 1B) in a decoder or an Intra/Inter coding module in an encoder (e.g. Intra Pred. 110/Inter Pred. 112 in Fig. 1A) . Any of the proposed IBC coding and signalling methods can also be implemented as a circuit coupled to the intra/inter coding module at the decoder or the encoder. However, the decoder or encoder may also use an additional processing unit to implement the required prediction processing. While the Intra Pred. units (e.g., unit 110/112 in Fig. 1A and unit 150/152 in Fig. 1B) are shown as individual processing units, they may correspond to executable software or firmware codes stored on a media, such as hard disk or flash memory, for a CPU (Central Processing Unit) or programmable devices (e.g. DSP (Digital Signal Processor) or FPGA (Field Programmable Gate Array) ) .
Fig. 16 illustrates a flowchart of an exemplary video coding system that generates a final predictor of IBC-GPM mode by combining one IBC AMVP/Merge predictor and one inter predictor according to an embodiment of the present invention. The steps shown in the flowchart may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to the method, input data associated with a current block are received in step 1610, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side. The current block is partitioned into a first sub-partition and a second sub-partition in step 1620. First prediction is derived for the first sub-partition in step 1630, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor. Second prediction is derived for the second sub-partition in step 1640, wherein the second prediction comprises an inter predictor. The first sub-partition and the second sub-partition of the current block are encoded or decoded using information comprising the first prediction and the second prediction respectively in step 1650.
Fig. 17 illustrates a flowchart of an exemplary video coding system that uses a joint IBC and Intra candidate list according to an embodiment of the present invention. According to another method, input data associated with a current block are received in step 1710, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side. A prediction candidate list is derived in step 1720, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates. The current block is encoded or decoded in an IBC mode or an intra prediction mode using information comprising the prediction candidate list in step 1730.
Fig. 18 illustrates a flowchart of an exemplary video coding system that signals the transform syntax for an IBC coded block by treating IBC modes as inter sub-modes according to an embodiment of the present invention. According to yet another method, input data associated with a current block are received in step 1810, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and the current block is coded in an IBC (Intra Block Copy) mode. A transform syntax is signalled or parsed in a same way as an inter-coded block in step 1820, wherein the transform syntax is associated with transform information for the current block. Forward transform or inverse transform is applied to the current block using the transform information in step 1830.
The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.
The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.
Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA) . These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Claims (18)
- A method of video coding, the method comprising:receiving input data associated with a current block, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side;partitioning the current block into a first sub-partition and a second sub-partition;deriving first prediction for the first sub-partition, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor;deriving second prediction for the second sub-partition, wherein the second prediction comprises an inter predictor; andencoding or decoding the first sub-partition and the second sub-partition of the current block using information comprising the first prediction and the second prediction respectively.
- The method of Claim 1, wherein the IBC-based predictor corresponds to an IBC AMVP (Adaptive Motion Vector Prediction) predictor or an IBC Merge predictor.
- The method of Claim 1, wherein the inter predictor corresponds to a regular merge predictor, MMVD (Merge with Motion Vector Difference) predictor, TM merge predictor, BM (Block Matching) merge predictor, CIIP (Combined Inter Merge/Intra Prediction) predictor, Affine merge predictor, AMVP predictor, or SbtMVP (Subblock-based Temporal Motion Vector Prediction) predictor.
- The method of Claim 1, wherein a joint IBC and regular merge candidate list is constructed by including at least one IBC candidate and at least one regular merge candidate, and at least one target merge candidate is selected from the joint IBC and regular merge candidate list for one of the first sub-partition and the second sub-partition.
- The method of Claim 1, wherein a second target candidate is selected from an intra prediction mode candidate list or a TM (Template Matching) merge candidate list for another of the first sub-partition and the second sub-partition.
- An apparatus for video coding, the apparatus comprising one or more electronics or processors arranged to:receive input data associated with a current block, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side;partition the current block into a first sub-partition and a second sub-partition;derive first prediction for the first sub-partition, wherein the first prediction comprises an IBC-based (Intra Block Copy based) predictor;derive second prediction for the second sub-partition, wherein the second prediction comprises an inter predictor; andencode or decode the first sub-partition and the second sub-partition of the current block using information comprising the first prediction and the second prediction respectively.
- A method of video coding, the method comprising:receiving input data associated with a current block, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side;deriving a prediction candidate list, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates; andencoding or decoding the current block in an IBC mode or an intra prediction mode using information comprising the prediction candidate list.
- The method of Claim 7, wherein when the current block is coded in the IBC mode, a target candidate is selected from the prediction candidate list without a need for signalling or parsing a flag indicating the IBC mode.
- The method of Claim 7, wherein said one or more IBC-based prediction candidates comprise one or more IBC spatial candidates, one or more non-adjacent candidates, one or more HMVP (History-based Motion Vector Prediction) candidates, or a combination thereof.
- The method of Claim 7, wherein member candidates in the prediction candidate list are reordered according to TM (Template Matching) cost and a subset of the member candidates with lowest TM costs are kept in the prediction candidate list.
- The method of Claim 7, wherein when the current block is coded in the intra prediction mode, the current block is allowed to reference one or more previously coded blocks in the IBC mode.
- An apparatus for video coding, the apparatus comprising one or more electronics or processors arranged to: receive input data associated with a current block, wherein the input data comprise pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side; derive a prediction candidate list, wherein the prediction candidate list comprises one or more IBC-based (Intra Block Copy based) prediction candidates and one or more intra prediction candidates; and encode or decode the current block in an IBC mode or an intra prediction mode using information comprising the prediction candidate list.
- A method of video coding, the method comprising: receiving input data associated with a current block, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and wherein the current block is coded in an IBC (Intra Block Copy) mode; signalling or parsing a transform syntax in a same way as an inter-coded block, wherein the transform syntax is associated with transform information for the current block; and applying forward transform or inverse transform to the current block using the transform information.
- The method of Claim 13, wherein the transform syntax corresponds to available MTS (Multiple Transform Selection) candidates and the available MTS candidates are the same for an IBC coded block and the inter-coded block.
- The method of Claim 13, wherein the transform syntax corresponds to a number of MTS (Multiple Transform Selection) candidates or a size constraint for MTS, and the number of MTS candidates or the size constraint for MTS is the same for an IBC coded block and the inter-coded block.
- The method of Claim 13, wherein the transform syntax corresponds to a pre-defined transform mode.
- The method of Claim 16, wherein the pre-defined transform mode corresponds to MTS (Multiple Transform Selection), SBT (Subblock Transform), secondary transform, JCCR (Joint Coding of Chroma Residuals), NSPT (Non-Separable Primary Transform), or a combination thereof.
- An apparatus for video coding, the apparatus comprising one or more electronics or processors arranged to: receive input data associated with a current block, wherein the input data comprise pixel data or residual data to be encoded at an encoder side or transformed data associated with the current block to be decoded at a decoder side, and wherein the current block is coded in an IBC (Intra Block Copy) mode; signal or parse a transform syntax in a same way as an inter-coded block, wherein the transform syntax is associated with transform information for the current block; and apply forward transform or inverse transform to the current block using the transform information.
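The candidate-list claims above describe reordering member candidates by TM (Template Matching) cost and keeping only the lowest-cost subset. The sketch below illustrates that step under simplifying assumptions: the TM cost is taken to be the SAD between the current block's template and the candidate's reference template, candidates are plain block vectors, and the function names (`tm_cost`, `reorder_candidates`, `fetch_ref_template`) are hypothetical, not taken from the publication.

```python
import numpy as np

def tm_cost(cur_template: np.ndarray, ref_template: np.ndarray) -> float:
    """TM cost as sum of absolute differences (SAD) between the
    current block's template and the candidate's reference template.
    (SAD is one common choice; the claim does not fix the metric.)"""
    return float(np.abs(cur_template.astype(np.int64)
                        - ref_template.astype(np.int64)).sum())

def reorder_candidates(candidates, cur_template, fetch_ref_template, keep=6):
    """Reorder a prediction candidate list by ascending TM cost and
    keep only the `keep` candidates with the lowest costs, as in the
    claimed list construction."""
    scored = [(tm_cost(cur_template, fetch_ref_template(c)), c)
              for c in candidates]
    scored.sort(key=lambda sc: sc[0])          # lowest TM cost first
    return [cand for _, cand in scored[:keep]]

# Toy example on an 8x8 "frame": the template is the row of samples
# just above a block whose top-left sample sits at (3, 2).
frame = np.arange(64, dtype=np.uint8).reshape(8, 8)
cur_template = frame[2:3, 2:6]

def fetch_ref_template(bv):
    """Reference template addressed by block vector bv = (dy, dx)."""
    dy, dx = bv
    return frame[2 + dy:3 + dy, 2 + dx:6 + dx]

candidates = [(-1, 0), (-2, 0), (0, -1)]
kept = reorder_candidates(candidates, cur_template, fetch_ref_template, keep=2)
```

On this toy frame the vector `(0, -1)` yields the smallest SAD, so it moves to the front of the pruned list. A real codec would apply the same ordering to spatial, non-adjacent, and HMVP candidates before any index signalling.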
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24744268.4A EP4652738A1 (en) | 2023-01-17 | 2024-01-16 | Method and apparatus of combined intra block copy prediction and syntax design for video coding |
| CN202480008270.9A CN120548707A (en) | 2023-01-17 | 2024-01-16 | Video coding and decoding method and device |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363480110P | 2023-01-17 | 2023-01-17 | |
| US63/480,110 | 2023-01-17 | ||
| US202363495588P | 2023-04-12 | 2023-04-12 | |
| US63/495,588 | 2023-04-12 | ||
| US202363500308P | 2023-05-05 | 2023-05-05 | |
| US63/500,308 | 2023-05-05 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024153093A1 true WO2024153093A1 (en) | 2024-07-25 |
Family
ID=91955286
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/072614 Ceased WO2024153093A1 (en) | 2023-01-17 | 2024-01-16 | Method and apparatus of combined intra block copy prediction and syntax design for video coding |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4652738A1 (en) |
| CN (1) | CN120548707A (en) |
| WO (1) | WO2024153093A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20200107260A (en) * | 2019-03-07 | 2020-09-16 | 주식회사 엑스리스 | Video signal encoding method and apparatus and video decoding method and apparatus |
| US20220030223A1 (en) * | 2019-05-24 | 2022-01-27 | Huawei Technologies Co., Ltd. | Encoder, a decoder and corresponding methods using for ibc merge list |
| US20220046272A1 (en) * | 2019-05-25 | 2022-02-10 | Beijing Bytedance Network Technology Co., Ltd. | Coding of block vectors for intra block copy-coded blocks |
| US20220086433A1 (en) * | 2019-06-04 | 2022-03-17 | Beijing Bytedance Network Technology Co., Ltd. | Conditional implementation of motion candidate list construction process |
| US20220109831A1 (en) * | 2019-06-19 | 2022-04-07 | Lg Electronics Inc. | Image decoding method for deriving prediction sample on basis of default merge mode, and device therefor |
2024
- 2024-01-16 CN CN202480008270.9A patent/CN120548707A/en active Pending
- 2024-01-16 WO PCT/CN2024/072614 patent/WO2024153093A1/en not_active Ceased
- 2024-01-16 EP EP24744268.4A patent/EP4652738A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| EP4652738A1 (en) | 2025-11-26 |
| CN120548707A (en) | 2025-08-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2023051532A1 (en) | Method, device, and medium for video processing | |
| WO2023116706A1 (en) | Method and apparatus for cross component linear model with multiple hypotheses intra modes in video coding system | |
| WO2020228578A1 (en) | Method and apparatus of luma most probable mode list derivation for video coding | |
| WO2017084577A1 (en) | Method and apparatus for intra prediction mode using intra prediction filter in video and image compression | |
| WO2023116716A1 (en) | Method and apparatus for cross component linear model for inter prediction in video coding system | |
| JP7797460B2 (en) | Syntax for signaling cross-component video coding | |
| WO2023241637A1 (en) | Method and apparatus for cross component prediction with blending in video coding systems | |
| CN114424534B (en) | Method and device for generating chroma direct mode for video coding | |
| WO2024078596A9 (en) | Method, apparatus, and medium for video processing | |
| WO2023125834A1 (en) | Method, apparatus, and medium for video processing | |
| WO2024153093A1 (en) | Method and apparatus of combined intra block copy prediction and syntax design for video coding | |
| WO2024212882A1 (en) | Methods and apparatus of general transform in intra block copy mode or intra template matching mode for video coding | |
| WO2024212889A1 (en) | Methods and apparatus of partial transform in intra block copy mode or intra template matching mode for video coding | |
| WO2024213093A1 (en) | Methods and apparatus of blending intra prediction for video coding | |
| WO2025007972A1 (en) | Methods and apparatus for inheriting cross-component models from temporal and history-based neighbours for chroma inter coding | |
| WO2025218726A1 (en) | Methods and apparatus for low-frequency non-separable transform with multiple transform sets in a video coding system | |
| WO2024174828A1 (en) | Method and apparatus of transform selection depending on intra prediction mode in video coding system | |
| WO2025007693A1 (en) | Methods and apparatus of inheriting cross-component models from non-intra coded blocks for cross-component prediction merge mode | |
| WO2024222624A1 (en) | Methods and apparatus of inheriting temporal cross-component models with buffer constraints for video coding | |
| WO2025077859A1 (en) | Methods and apparatus of propagating models for extrapolation intra prediction model inheritance in video coding | |
| WO2024140853A1 (en) | Method, apparatus, and medium for video processing | |
| WO2025007952A1 (en) | Methods and apparatus for video coding improvement by model derivation | |
| WO2025152827A1 (en) | Methods and apparatus of filter derivation for filter-based intra prediction in video coding system | |
| WO2024222798A1 (en) | Methods and apparatus of inheriting block vector shifted cross-component models for video coding | |
| WO2024222798A9 (en) | Methods and apparatus of inheriting block vector shifted cross-component models for video coding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24744268; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202480008270.9; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 202480008270.9; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 2024744268; Country of ref document: EP |