
HK1220531B - Advanced screen content coding solution - Google Patents

Advanced screen content coding solution

Info

Publication number: HK1220531B
Application number: HK16108372.3A
Authority: HK (Hong Kong)
Prior art keywords: color, palette, index, color table, index map
Other languages: Chinese (zh)
Other versions: HK1220531A1 (en)
Inventors: 马展, 王炜, 于浩平, 王显, 夜静
Original Assignee: 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority claimed from US14/549,405 (external priority, US10291827B2)
Application filed by 华为技术有限公司
Publication of HK1220531A1 (en)
Publication of HK1220531B (en)

Description

Advanced Screen Content Coding Solution

Technical Field

The present invention relates generally to screen content coding.

Background Art

Compared with conventional natural video, screen content coding poses new challenges for video compression technology because of its distinct signal characteristics. There appear to be several promising techniques for advanced screen content coding, such as pseudo string matching, palette coding, and intra motion compensation or intra block copy.

Among these techniques, pseudo string matching yields the highest gain for lossless coding, but with significant complexity overhead and difficulties in the lossy coding mode. Palette coding was developed for screen content based on the assumption that non-camera-captured content typically contains a limited number of distinct colors rather than the continuous tones found in natural video. Even though the pseudo string matching and palette coding methods show great potential, intra motion compensation or intra block copy was adopted into working draft (WD) version 4 and the reference software of the HEVC range extension (HEVC RExt) under development for screen content coding. This is mainly because the motion estimation and compensation approach has been studied extensively for decades, and its idea and practical implementation are fairly easy (especially for hardware).

However, the coding performance of intra block copy is limited by its fixed block-structure partitioning. On the other hand, performing block matching, somewhat similar to motion estimation within an intra picture, also significantly increases the computational and memory-access complexity of the encoder.

Summary of the Invention

The present invention is directed to an advanced screen content coding solution.

In one exemplary embodiment, a method for encoding screen content into a bitstream includes selecting a palette color table for a coding unit (CU) of the screen content, where a palette color table is created for the CU and palette color tables are created for neighboring CUs. An indexed color index map is created for the CU of the screen content using the selected palette color table. For each of a plurality of CUs, the selected palette color table and the color index map are encoded/compressed into the bitstream.

Brief Description of the Drawings

For a more complete understanding of the present invention and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which like numerals designate like objects, and in which:

FIG. 1 illustrates a screen content encoding solution using the palette color table and index map mode, or palette mode, in an exemplary embodiment of the present invention;

FIG. 2 illustrates a screen content decoding solution for the palette color table and index map mode, or palette mode;

FIG. 3 illustrates a process or workflow of the screen content solution for the palette color table and index map mode, or palette mode, of a CU;

FIG. 4 illustrates G, B, R components in conventional planar mode (left) converted to packed mode (right);

FIG. 5 illustrates re-generation of a palette color table using neighboring reconstructed blocks;

FIG. 6 illustrates an index map parsed from real subtitle content;

FIG. 7 illustrates a segment of a 1-D search after horizontal scanning;

FIG. 8 illustrates the U_PIXEL module;

FIG. 9 illustrates the U_ROW module;

FIG. 10 illustrates the U_CMP module;

FIG. 11 illustrates the U_COL module;

FIG. 12 illustrates the U_2D_BLOCK module;

FIG. 13 illustrates horizontal and vertical scan processing of the index map of an exemplary CU;

FIG. 14A illustrates the 4:2:0 chroma sampling format;

FIG. 14B illustrates the 4:4:4 chroma sampling format;

FIG. 15 illustrates interpolation between 4:2:0 and 4:4:4;

FIG. 16 illustrates index map processing with upper/left line buffers;

FIG. 17 illustrates an apparatus and method/flow applied to the current HEVC;

FIG. 18 illustrates an example of a communication system;

FIGS. 19A and 19B illustrate exemplary devices provided by the present invention that can implement the described methods and concepts.

Detailed Description

In the present invention, an advanced screen content coding solution that outperforms the High-Efficiency Video Coding (HEVC) range extension (e.g., HEVC Version 2 or HEVC RExt) is described. This new solution includes several algorithms designed specifically for screen content coding. These algorithms include pixel representation using a palette or color table, referred to herein as a palette color table, palette color table compression, color index map compression, string search, and residual compression. This technology is developed, harmonized, and can be integrated with the HEVC range extension (RExt) and future HEVC extensions to support efficient screen content coding. However, the technology could be implemented with any existing video standard. For simplicity, HEVC RExt is used as an example in the description below, and HEVC RExt software is used to describe and demonstrate the compression efficiency. Using a palette color table and index map, defined herein as the palette mode, this solution is integrated as an additional mode in HEVC to demonstrate the performance.

The concepts and descriptions of the present invention are illustrated in the accompanying drawings. FIG. 1 shows an encoder 10 having a processor 12 including a memory, and FIG. 2 shows a decoder 14 having a processor 16 and a memory; they respectively illustrate exemplary embodiments of the encoding and decoding solutions for the palette mode according to the present invention. As shown, the encoder 10 and the decoder 14 each include a processor and a memory and together form a codec solution. The codec solution includes the processor 12 of the encoder 10 executing new algorithms or methods comprising: Process 1: creating the palette color table; Process 2: classifying the colors or pixel values using the previously derived palette color table to obtain the corresponding color indices; Process 3: encoding the palette color table; Process 4: encoding the color index map; Process 5: encoding the residuals; and Process 6: writing new syntax elements into the compressed bitstream. The processor 16 of the decoder 14 executes new algorithms or methods comprising the reverse steps. FIG. 3 provides the process or workflow of the screen content solution according to the present invention.

Basically, an efficient color palette compression (CPC) method is performed on each coding unit (CU). A coding unit is a basic operating unit in HEVC and HEVC RExt, which is a squared block of pixels consisting of three components (i.e., RGB, YUV, or XYZ).

At each CU level, the CPC method involves two major steps. First, in the first step, the processor 12 derives or generates a palette color table. This table is ordered according to a histogram (i.e., the frequency of occurrence of each color value), its actual color intensity, or any arbitrary method, in order to increase the efficiency of the following encoding process. Based on the derived palette color table, each pixel in the original CU is converted to its color index within the palette color table. A contribution of the present invention is the technology to efficiently encode, such as by compression, the palette color table and the color index map of each CU into the bitstream. At the receiver side, the processor 16 parses the compressed bitstream to reconstruct, for each CU, the complete palette color table and the color index map, and then further derives the pixel value at each position by combining the color index with the palette color table.

In an illustrative example of the present invention, it is assumed that a CU contains N×N pixels (N = 8, 16, 32, 64 for compatibility with HEVC). The CU typically contains three chrominance (chroma) components (i.e., G, B, R; Y, Cb, Cr; or X, Y, Z) at different sampling ratios (i.e., 4:4:4, 4:2:2, 4:2:0). For simplicity, 4:4:4 sequences are illustrated in the present invention. For 4:2:2 and 4:2:0 video sequences, chroma upsampling could be applied to obtain the 4:4:4 sequences, or each color component could be processed independently, and then the same procedures described in the present invention can be applied. For 4:0:0 monochrome video, the content can be treated as an individual plane of 4:4:4 without the other two planes, and all methods for 4:4:4 can be applied directly.

Packed or Planar

FIG. 1 illustrates the method for a block, CTU or CU. First, a flag called enable_packed_component_flag is defined for each CU to indicate whether the current CU is processed in packed fashion or in conventional planar mode (i.e., the G, B, R or Y, U, V components are processed independently). FIG. 4 shows G, B, R components in conventional planar mode (left) converted to packed mode (right). YUV or other color formats could be processed in the same fashion as exemplified for the RGB content.

Both the packed mode and the planar mode have their own advantages and disadvantages. For instance, the planar mode supports parallel color component processing for G/B/R or Y/U/V. However, it may suffer from low coding efficiency. The packed mode can share the header information of the CU (such as the palette color table and index map in the present invention) among the different color components. However, it might break parallelism. A simple way to decide whether the current CU should be encoded in packed fashion is to measure the rate-distortion (R-D) cost. The enable_packed_component_flag is used to signal the encoding mode explicitly to the decoder, as sketched below.
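A minimal sketch of this R-D-based packed-versus-planar decision, assuming the encoder has already produced a trial encoding in each mode, could look as follows; the ModeResult structure, the lambda weighting, and the function name preferPackedMode are illustrative assumptions rather than normative parts of the solution:

// Bits and distortion (e.g., SSE) measured from one trial encoding of the CU.
struct ModeResult { double bits; double distortion; };

// Lagrangian cost J = D + lambda * R; returns the value to signal in enable_packed_component_flag.
bool preferPackedMode(const ModeResult& planar, const ModeResult& packed, double lambda)
{
    double costPlanar = planar.distortion + lambda * planar.bits;
    double costPacked = packed.distortion + lambda * packed.bits;
    return costPacked < costPlanar;   // true -> packed fashion, false -> planar mode
}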

In addition to defining enable_packed_component_flag at the CU level for low-level handling, this flag can be duplicated in the slice header or even at the sequence level (e.g., in the sequence parameter set or picture parameter set) to allow slice-level or sequence-level handling, depending on the specific application requirements.

Palette Color Table and Index Map Derivation

As shown in FIG. 1, for Process 1 and Process 3, the pixel locations of each CU are traversed, and the palette color table and the index map for the subsequent processing are derived. Each distinct color is ordered in the palette color table according to a histogram (i.e., its frequency of occurrence), its color intensity, or any arbitrary method, in order to increase the efficiency of the subsequent encoding process. For example, if a differential pulse code modulation (DPCM) method is used in the encoding process to code the difference between neighboring pixels, the optimal coding result can be obtained if the neighboring pixels are assigned adjacent color indices in the palette color table.

After obtaining the palette color table, each pixel is mapped to the corresponding color index to form the index map of the current CU. The processing of the index map is described in a subsequent section.

For a conventional planar CU, each color or chroma component can have its own individual palette color table, such as colorTable_Y, colorTable_U, colorTable_V or colorTable_R, colorTable_G, colorTable_B, named here in part as examples. Alternatively, the palette color table of a major component, such as Y in YUV or G in GBR, can be derived and shared among all components. Typically, with this sharing, color components other than Y or G would have some mismatch between their original pixel colors and the shared colors in the palette color table. A residual engine (such as the HEVC coefficient coding method) can then be applied to encode those mismatched residuals. On the other hand, for a packed CU, a single palette color table is shared among all components.

Pseudo code is provided to demonstrate the palette color table and index map derivation, as follows:

deriveColorTableIndexMap()
{
  deriveColorTable();
  deriveIndexMap();
}

deriveColorTable(src, cuWidth, cuHeight, maxColorNum)
{
  // src - input video source in planar or packed mode
  // cuWidth, cuHeight - width and height of the current CU
  /* maxColorNum - the maximum number of colors allowed in the color table */
  /* traverse the CU and build the color histogram */
  // memset(colorHist, 0, (1 << bitDepth) * sizeof(UInt));
  pos = 0;
  cuSize = cuWidth * cuHeight;
  while (pos < cuSize) {
    colorHist[src[pos++]]++;
  }
  /* just pick the non-zero entries of colorHist[] for the ordered color intensity table */
  j = 0;
  for (i = 0; i < (1 << bitDepth); i++)
  {
    if (colorHist[i] != 0)
      colorTableIntensity[j++] = colorHist[i];
  }
  colorNum = j;
  /* quicksort for the histogram */
  colorTableHist = quickSort(colorTableIntensity, colorNum);
  /* if maxColorNum >= colorNum, take all the colors */
  /* if maxColorNum < colorNum, only take the maxColorNum colors for colorTableHist.
     In this case, every pixel finds its best matching color and the corresponding index,
     and the difference (the actual pixel minus its corresponding color) is coded by the
     residual engine. */
  /* the optimal number of colors in the color table can be determined via an iterative
     R-D cost derivation. */
}

deriveIndexMap()
{
  pos = 0;
  cuSize = cuWidth * cuHeight;
  while (pos < cuSize)
  {
    minErr = MAX_UINT;
    for (i = 0; i < colorNum; i++)
    {
      err = abs(src[pos] - colorTable[i]);
      if (err < minErr)
      {
        minErr = err;
        idx = i;
      }
    }
    idxMap[pos++] = idx;   // advance to the next pixel position
  }
}

Palette Color Table Processing

For Process 1 in FIG. 1, palette color table processing involves the processor 12 encoding the size of the palette color table (i.e., the total number of distinct colors) and then each color itself. A majority of the bits are consumed by encoding each color in the palette color table. Hence, the focus is placed on the color encoding (i.e., the encoding of each entry in the palette color table).

The most straightforward way to encode the colors in the palette color table is to use a pulse code modulation (PCM) style algorithm, where each color is coded independently. Alternatively, nearest-neighbor prediction of successive colors can be applied, and the prediction delta is then encoded rather than the original color intensity, which is the DPCM (differential PCM) style. Both methods can subsequently be entropy coded using either an equal-probability model or an adaptive context model, depending on the trade-off between complexity cost and coding efficiency.
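The two styles can be contrasted with the rough sketch below, assuming the table entries are 8-bit intensities; the BitWriter type, the sign-plus-magnitude delta coding, and the function names are illustrative assumptions and are not the entropy coding actually mandated by this solution:

#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical bit-writer used only for illustration; it records (value, numBits) pairs.
struct BitWriter {
    std::vector<std::pair<uint32_t, int>> out;
    void writeBits(uint32_t value, int numBits) { out.push_back({value, numBits}); }
};

// PCM style: every color entry is coded independently with bitDepth bits.
void codePalettePCM(const std::vector<uint8_t>& palette, int bitDepth, BitWriter& bw)
{
    for (uint8_t c : palette) bw.writeBits(c, bitDepth);
}

// DPCM style: code the first entry as-is, then the delta to the previous entry.
void codePaletteDPCM(const std::vector<uint8_t>& palette, int bitDepth, BitWriter& bw)
{
    if (palette.empty()) return;
    bw.writeBits(palette[0], bitDepth);
    for (size_t i = 1; i < palette.size(); ++i) {
        int delta = int(palette[i]) - int(palette[i - 1]);
        bw.writeBits(delta < 0 ? 1u : 0u, 1);                            // sign bin
        bw.writeBits(uint32_t(delta < 0 ? -delta : delta), bitDepth);    // magnitude
    }
}

If the table is ordered by intensity, the deltas are small non-negative values, which is why the DPCM style can be cheaper before entropy coding.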

Another advanced scheme, called neighboring palette color table merge, is disclosed here, where a color_table_merge_flag is defined to indicate whether the current CU uses the palette color table of its left CU or its upper CU. If not, the current CU carries the palette color table signaling explicitly. For the merging process, another flag, color_table_merge_direction, indicates whether the merge direction is from the upper CU or from the left CU. Of course, the candidates are not limited to the current upper CU or left CU; for example, the upper-left CU, the upper-right CU, and so on could also be used, but the upper CU and the left CU are used in the present invention to illustrate the idea. In either case, each pixel is compared with the entries in the existing palette color table and, through deriveIdxMap(), is assigned the index that yields the smallest prediction difference (i.e., the pixel value minus the closest color in the palette color table). For the case where the prediction difference is non-zero, all of these residuals are encoded using the HEVC RExt residual engine. Note that whether or not to use the merging process can be decided by the R-D cost.

There are several ways to generate the neighboring palette color table used in the merging process when coding the current CU. Depending on the implementation, one of them requires updating both the encoder and the decoder, and the other is an encoder-side-only process.

Updating both the encoder and the decoder: In this method, the palette color table of a neighboring CU is generated from the available reconstructed pixels, regardless of the CU depth, size, and so on. For each CU, the reconstructions are retrieved for the neighboring CU at the same size and the same depth (assuming the color similarity would be higher in this case). For example, as shown in FIG. 5, if the current CU is 16×16 with depth 2, then no matter how its neighboring CUs are partitioned (e.g., the left CU is 8×8 with depth 3 and the upper CU is 32×32 with depth 1), a pixel offset (= 16) is applied from the origin of the current CU to process the left 16×16 block to the left and the upper 16×16 block above. Note that both the encoder and the decoder have to maintain this process.

Constrained encoder-only process: For this method, the merging process occurs when the current CU shares the same size and depth as its upper and/or left CU. The color index map of the current CU is then derived using the palette color table of an available neighboring CU. For example, for a current 16×16 CU, if its neighboring CU, i.e., the CU located above or to the left, is encoded using the palette color table and index method, the R-D cost of the current CU is derived directly using that CU's palette color table. This merge cost is compared with the cost of the current CU deriving its palette color table explicitly (as well as with the other conventional modes existing in HEVC or HEVC RExt). Whichever mode produces the smaller R-D cost is chosen as the final mode to be written into the output bitstream. As can be seen, only the encoder needs to try/simulate the different potential modes; at the decoder side, the color_table_merge_flag and color_table_merge_direction indicate the merge decision and merge direction, and no additional processing work is required.
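For illustration only, the encoder-side merge decision could be written as a small cost comparison like the following; the direction convention (0 = upper, 1 = left), the availability flags, and the function name are assumptions made for this sketch:

// R-D costs measured by trial encodings of the current CU: explicit table, merge from upper, merge from left.
struct MergeDecision { bool mergeFlag; int mergeDirection; };  // direction: 0 = upper CU, 1 = left CU (assumed)

MergeDecision chooseColorTableMerge(double costExplicit,
                                    double costMergeUpper, bool upperAvailable,
                                    double costMergeLeft,  bool leftAvailable)
{
    MergeDecision d{false, 0};
    double best = costExplicit;
    if (upperAvailable && costMergeUpper < best) { best = costMergeUpper; d = {true, 0}; }
    if (leftAvailable  && costMergeLeft  < best) { best = costMergeLeft;  d = {true, 1}; }
    // d.mergeFlag maps to color_table_merge_flag; d.mergeDirection maps to color_table_merge_direction.
    return d;
}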

Color Index Map Processing

For Process 3 in FIG. 1, several solutions have been studied for encoding the color index map, for example, the RUN mode, RUN and COPY_ABOVE, and adaptive neighboring index prediction. In the present invention, a 1D string matching method and its 2D variation are disclosed for the index map encoding. At each position, a matched point is found; the matched distance and length are recorded for the 1D string match, or the width/height are recorded for the 2D string match. For an unmatched position, its index value, or the delta between its index value and a predicted index value, is encoded directly.

Disclosed here is a method of direct 1D search over the color index map. Referring to FIG. 6, the index map is parsed from real subtitle content. FIG. 7 shows a segment after the 1-D search (i.e., the beginning of this index map).

String matching is applied starting at the first position of the 1-D color index vector. An example of this 1-D string matching is given below. For the very first position of each index map, such as the 14 shown in FIG. 7, there is no buffered reference yet, so this first index is treated as an "unmatched pair", and -1 and 1 are assigned as its corresponding distance and length, noted as (dist, len) = (-1, 1). The second index is again a "14"; it is the first index coded using a reference, hence dist = 1, and because the third position is yet another "14", the length is 2, i.e., len = 2 (assuming that every index currently being coded can directly serve as the reference for the index that follows). Moving forward to the 4th position, a "17" is encountered that has not appeared before; hence, it is encoded as an unmatched pair again, i.e., (dist, len) = (-1, 1). For an unmatched pair, the flag is encoded (e.g., "dist == -1") followed by the real value of the index (such as the first appearance of "14", "17", "6", and so on). On the other hand, for a matched pair, the flag is still encoded (e.g., "dist != -1") followed by the length of the matched string.

Here is a summary of the encoding procedure using the exemplary indices shown in FIG. 7:

dist = -1, len = 1, idx = 14 (unmatched)

dist = 1, len = 2 (matched)

dist = -1, len = 1, idx = 17 (unmatched)

dist = 1, len = 3 (matched)

dist = -1, len = 1, idx = 6 (unmatched)

dist = 1, len = 25 (matched)

dist = 30, len = 4 (matched) /* because "17" appeared before */

…

Pseudo code for this matched pair derivation is provided as follows:

Void deriveMatchedPairs(TComDataCU* pcCU, Pel* pIdx, Pel* pDist, Pel* pLen, UInt uiWidth, UInt uiHeight)
{
  // pIdx is the index CU bounded within uiWidth * uiHeight
  UInt uiTotal = uiWidth * uiHeight;
  UInt uiIdx = 0;
  Int j = 0;
  Int len = 0;
  Int maxLen = 0;
  Int dist = -1;
  // since there is no left/upper buffer, the first pixel is coded as itself
  pDist[uiIdx] = -1;
  pLen[uiIdx] = 0;
  uiIdx++;
  while (uiIdx < uiTotal)
  {
    len = 0;
    maxLen = 0;
    dist = -1;
    for (j = uiIdx - 1; j >= 0; j--)
    {
      // if a matched pair is found; an exhaustive search is applied currently
      // a fast string search could be applied instead
      if (pIdx[j] == pIdx[uiIdx])
      {
        for (len = 0; len < (uiTotal - uiIdx); len++)
        {
          if (pIdx[j + len] != pIdx[len + uiIdx])
            break;
        }
      }
      if (len > maxLen) /* better to refine with an R-D decision */
      {
        maxLen = len;
        dist = (uiIdx - j);
      }
    }
    pDist[uiIdx] = dist;
    pLen[uiIdx] = maxLen;
    // advance by the matched length, or by one index for an unmatched position
    uiIdx = uiIdx + (maxLen > 0 ? maxLen : 1);
  }
}

When using the 2D search variation, the following steps are performed (a sketch of the final selection step is given after this list):

Identify the location of the current pixel and the location of a reference pixel as a starting point;

Apply a horizontal 1D string match to the right of the current pixel and the reference pixel; the maximum search length is constrained by the end of the current horizontal row; record the maximum search length as right_width;

Apply a horizontal 1D string match to the left of the current pixel and the reference pixel; the maximum search length is constrained by the beginning of the current horizontal row; record the maximum search length as left_width;

Perform the same 1D string match on the next row, using the pixels below the current pixel and the reference pixel as the new current pixel and reference pixel;

Stop when right_width == left_width == 0;

Now, for each height[n] = {1, 2, 3, ...}, there is a corresponding width[n] array {{left_width[1], right_width[1]}, {left_width[2], right_width[2]}, {left_width[3], right_width[3]}, ...};

Define a new min_width array {{lwidth[1], rwidth[1]}, {lwidth[2], rwidth[2]}, {lwidth[3], rwidth[3]}, ...} for each height[n], where lwidth[n] = min(left_width[1:n-1]) and rwidth[n] = min(right_width[1:n-1]);

Also define a size array {size[1], size[2], size[3], ...}, where size[n] = height[n] × (lwidth[n] + rwidth[n]);

Assuming size[n] holds the maximum value in the size array, the width and height of the 2D string match are selected using the corresponding {lwidth[n], rwidth[n], height[n]}.
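As a rough sketch of the selection step above, assuming the per-row left/right widths have already been collected and reading the running minimum as being taken over the rows processed so far, the best rectangle could be picked as follows; the structure names and this particular reading of the min() index range are assumptions of the sketch:

#include <climits>
#include <vector>

struct RowWidths { int left; int right; };        // left_width[n], right_width[n] of one row
struct Match2D   { int lwidth; int rwidth; int height; };

// Picks {lwidth[n], rwidth[n], height[n]} maximizing size[n] = height[n] * (lwidth[n] + rwidth[n]).
Match2D select2DMatch(const std::vector<RowWidths>& rows)
{
    Match2D best{0, 0, 0};
    int bestSize = 0;
    int lmin = INT_MAX, rmin = INT_MAX;
    for (size_t n = 0; n < rows.size(); ++n) {
        lmin = lmin < rows[n].left  ? lmin : rows[n].left;    // running minimum of left widths
        rmin = rmin < rows[n].right ? rmin : rows[n].right;   // running minimum of right widths
        int size = int(n + 1) * (lmin + rmin);                // height[n] = n + 1 rows
        if (size > bestSize) { bestSize = size; best = {lmin, rmin, int(n + 1)}; }
    }
    return best;
}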

One technique to optimize the speed of the 1D or 2D search is to use a running hash. A 4-pixel running hash structure is described in the present invention. A running hash is calculated for every pixel in the horizontal direction to generate a horizontal hash array running_hash_h[]. Another running hash is calculated on top of running_hash_h[] to generate a 2D hash array running_hash_hv[]. Each value match in this 2D hash array represents a 4×4 block match. To perform a 2D match, as many 4×4 block matches as possible are found before falling back to a pixel-wise comparison with their neighbors. Since the pixel-wise comparison is limited to 1-3 pixels, the search speed can be increased dramatically.
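A minimal sketch of such a hash structure is given below; the particular hash combination (a multiplicative rolling mix) and the function names are assumptions chosen for illustration, not the hash defined by this solution:

#include <cstdint>
#include <vector>

// Simple mixing step used by both passes of this sketch.
static inline uint32_t mix(uint32_t h, uint32_t v) { return h * 31u + v; }

// Horizontal pass: running_hash_h[x] summarizes the 4 pixels row[x..x+3].
std::vector<uint32_t> runningHashH(const std::vector<uint8_t>& row)
{
    std::vector<uint32_t> h(row.size(), 0);
    for (size_t x = 0; x + 4 <= row.size(); ++x) {
        uint32_t v = 0;
        for (int k = 0; k < 4; ++k) v = mix(v, row[x + k]);
        h[x] = v;
    }
    return h;
}

// Vertical pass on top of running_hash_h[]: running_hash_hv[y][x] summarizes a 4x4 block,
// so equal entries are candidate 4x4 block matches that are then verified pixel-wise.
std::vector<std::vector<uint32_t>> runningHashHV(const std::vector<std::vector<uint32_t>>& hashH)
{
    size_t rows = hashH.size();
    size_t cols = rows ? hashH[0].size() : 0;
    std::vector<std::vector<uint32_t>> hv(rows, std::vector<uint32_t>(cols, 0));
    for (size_t y = 0; y + 4 <= rows; ++y)
        for (size_t x = 0; x < cols; ++x) {
            uint32_t v = 0;
            for (int k = 0; k < 4; ++k) v = mix(v, hashH[y + k][x]);
            hv[y][x] = v;
        }
    return hv;
}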

It follows from the above description that the matched widths of the rows differ from each other, so each row has to be processed separately. To achieve high efficiency and low complexity, a block-based algorithm is disclosed, which can be used in both hardware and software implementations. Much like standard motion estimation, this algorithm processes one rectangular block at a time.

Take a 4×4 block as an example. As shown in FIG. 8, the basic unit in this design is called U_PIXEL. The coded signal is a flag that indicates whether the reference pixel has already been encoded in a previous string match operation. Optionally, the input signal Cmp[n-1] can be forced to "0", which allows the last "OR" gate to be removed from the U_PIXEL module.

The first step is to process each row in parallel. Each pixel in one row of the rectangle is assigned to a U_PIXEL block; this processing unit is called U_ROW. An example of the processing unit for the first row is shown in FIG. 9.

As shown in FIG. 10, four U_ROW units are needed to process this 4×4 block. Its output is the cmp[4][4] array.

The next step is to process each column of the cmp array in parallel. As shown in FIG. 11, each cmp in a column of the cmp array is processed by the processing unit U_COL.

Four U_COL units are needed to process this 4×4 block. As shown in FIG. 12, its output is the rw[4][4] array.

The number of zeros in each row of rw[n][0-3] is then counted, and the four results are recorded in the r_width[n] array. Note that r_width[n] equals rwidth[n] in step #7. l_width[n] is generated in the same fashion. The min_width array in step #7 can then be obtained as {{l_width[1], r_width[1]}, {l_width[2], r_width[2]}, {l_width[3], r_width[3]}, ...}.

This hardware architecture can be modified to fit the parallel processing framework of any modern CPU/DSP/GPU. A simplified pseudo code for a fast software implementation is listed below:

// 1. generate the C[][] array
for (y = 0; y < height; ++y)
{
  for (x = 0; x < width; ++x)
  {
    tmp1 = cur_pixel ^ ref_pixel;
    tmp2 = tmp1[0] | tmp1[1] | tmp1[2] | tmp1[3] | tmp1[4] | tmp1[5] | tmp1[6] | tmp1[7];
    C[y][x] = tmp2 & (!coded[y][x]);
  }
}

// 2. generate the CMP[][] array
for (y = 0; y < height; ++y)
{
  CMP[y][0] = C[y][0];
}
for (x = 1; x < width; ++x)
{
  for (y = 0; y < height; ++y)
  {
    CMP[y][x] = C[y][x] | CMP[y][x-1];
  }
}

// 3. generate the RW[][] or LW[][] array
for (x = 0; x < width; ++x)
{
  RW[0][x] = CMP[0][x];
}
for (y = 1; y < height; ++y)
{
  for (x = 0; x < width; ++x)
  {
    RW[y][x] = CMP[y][x] | RW[y-1][x];
  }
}

// 4. convert RW[][] to R_WIDTH[]
for (y = 0; y < height; ++y)
{
  // count the zeros, or use leading zero detection
  R_WIDTH[y] = LZD(RW[y][0], RW[y][1], RW[y][2], RW[y][3]);
}

There is no data dependence within each loop, so conventional software parallel processing techniques, such as loop unrolling or MMX/SSE, can be applied to increase the execution speed.

This method can also be applied to the 1D search if the number of rows is limited to one. A simplified pseudo code for a fast software implementation of the fixed-length-based 1D search is listed below:

// 1. generate the C[] array
for (x = 0; x < width; ++x)
{
  tmp1 = cur_pixel ^ ref_pixel;
  tmp2 = tmp1[0] | tmp1[1] | tmp1[2] | tmp1[3] | tmp1[4] | tmp1[5] | tmp1[6] | tmp1[7];
  C[x] = tmp2 & (!coded[x]);
}

// 2. generate the RW[] or LW[] array
if (the last "OR" operation in the U_PIXEL module is removed)
  assign RW[] = C[]
else {
  RW[0] = C[0];
  for (x = 1; x < width; ++x)
  {
    RW[x] = C[x] | RW[x-1];
  }
}

// 3. convert RW[] to R_WIDTH[]
// count the zeros, or use leading zero detection
if (the last "OR" operation in the U_PIXEL module is removed)
  R_WIDTH = LZD(RW[0], RW[1], RW[2], RW[3]);
else
  R_WIDTH = COUNT_ZERO(RW[0], RW[1], RW[2], RW[3]);

After both the 1D match and the 2D match are completed, the larger of (1D length, 2D size (width × height)) is selected as the winner. If the lwidth of the 2D match is non-zero, the length of the previous 1D match needs to be adjusted (length = length − lwidth) to avoid an overlap between the previous 1D match and the current 2D match. If the length of the previous 1D match becomes zero after the adjustment, it is removed from the match list.

If the previous match is a 1D match, the next starting location is calculated as current_location + length; if the previous match is a 2D match, the next starting location is calculated as current_location + (lwidth + rwidth). When a 1D match is being performed, if any pixel to be matched falls inside a previous 2D match region, i.e., its location has already been covered by a 2D match, the next pixels are scanned until a pixel location is found that has not been coded by a previous match.
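The overlap adjustment and the computation of the next starting location described in the last two paragraphs could be sketched as follows; the Match structure and the helper name are assumptions of this sketch:

#include <vector>

struct Match {
    bool is2D;
    int length;          // used when is2D == false (1D match)
    int lwidth, rwidth;  // used when is2D == true  (2D match)
};

// Trims the previous 1D match when the newest 2D match reaches back over it (length -= lwidth),
// drops it if it shrinks to zero, and returns the advance to the next starting location.
int resolveOverlapAndAdvance(std::vector<Match>& matches)
{
    if (matches.empty()) return 0;
    size_t cur = matches.size() - 1;
    if (matches[cur].is2D && matches[cur].lwidth != 0 && cur > 0 && !matches[cur - 1].is2D) {
        matches[cur - 1].length -= matches[cur].lwidth;            // length = length - lwidth
        if (matches[cur - 1].length <= 0)
            matches.erase(matches.begin() + (cur - 1));            // remove the emptied 1D match
    }
    const Match& last = matches.back();
    return last.is2D ? (last.lwidth + last.rwidth) : last.length;  // added to current_location
}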

After these matched pairs are obtained, an entropy engine is applied to convert these symbols into the binary stream. Exemplified here is the idea of using the equal-probability context mode. An advanced adaptive context mode could also be applied for better compression efficiency.

// loop over each CU, with uiTotal = uiWidth * uiHeight and uiIdx = 0;
while (uiIdx < uiTotal) {
  // *pDist: stores the distance value of each matched pair
  // *pIdx:  stores the index value of each matched pair
  // *pLen:  stores the length value of each matched pair
  // encodeEP() and encodeEPs() both reuse the HEVC (or similar) bypass entropy coding.
  if (pDist[uiIdx] == -1)
  {
    // encode one bin using the equal-probability mode to indicate
    // whether the current pair is matched or not.
    unmatchedPairFlag = TRUE;
    encodeEP(unmatchedPairFlag);
    // uiIndexBits is controlled by the size of the color table,
    // i.e., 5 bits are needed for 24 distinct colors, 3 bits for 8 colors
    encodeEPs(pIdx[uiIdx], uiIndexBits);
    uiIdx++;
  }
  else
  {
    unmatchedPairFlag = FALSE;
    encodeEP(unmatchedPairFlag);
    /* bound the binarization using the possible maximum value */
    UInt uiDistBits = 0;
    // offset is used to add additional references from the neighboring blocks;
    // here, offset is set to 0 first;
    while ((1 << uiDistBits) <= (uiIdx + offset))
    {
      uiDistBits++;
    }
    encodeEPs(pDist[uiIdx], uiDistBits);
    /* bound the binarization using the possible maximum value */
    UInt uiLenBits = 0;
    while ((1 << uiLenBits) <= (uiTotal - uiIdx))
    {
      uiLenBits++;
    }
    encodeEPs(pLen[uiIdx], uiLenBits);
    uiIdx += pLen[uiIdx];
  }
}

Shown above is the encoding procedure for each matched pair. Correspondingly, the decoding procedure for the matched pairs is as follows:

// loop over each CU, with uiTotal = uiWidth * uiHeight and uiIdx = 0;
while (uiIdx < uiTotal) {
  // *pDist: stores the distance value of each matched pair
  // *pIdx:  stores the index value of each matched pair
  // *pLen:  stores the length value of each matched pair
  // parseEP() and parseEPs() both reuse the HEVC (or similar) bypass entropy coding.
  // parse the unmatched pair flag
  parseEP(&uiUnmatchedPairFlag);
  if (uiUnmatchedPairFlag)
  {
    parseEPs(uiSymbol, uiIndexBits);
    pIdx[uiIdx] = uiSymbol;
    uiIdx++;
  }
  else
  {
    /* bound the binarization using the possible maximum value */
    UInt uiDistBits = 0;
    // offset is used to add additional references from the neighboring blocks;
    // here, offset is set to 0 first;
    while ((1 << uiDistBits) <= (uiIdx + offset))
      uiDistBits++;
    UInt uiLenBits = 0;
    while ((1 << uiLenBits) <= (uiTotal - uiIdx))
      uiLenBits++;
    parseEPs(uiSymbol, uiDistBits);
    pDist[uiIdx] = uiSymbol;
    parseEPs(uiSymbol, uiLenBits);
    pLen[uiIdx] = uiSymbol;
    for (UInt i = 0; i < pLen[uiIdx]; i++)
      pIdx[i + uiIdx] = pIdx[i + uiIdx - pDist[uiIdx]];
    uiIdx += pLen[uiIdx];
  }
}

Note that only the pixels at unmatched positions are encoded into the bitstream. To obtain a more accurate statistical model, only these pixels and their neighbors are used for the palette color table derivation, instead of all of the pixels in the CU.

These index or delta outputs typically contain a limited number of distinct values under certain coding modes. The present invention introduces a second delta palette table to take advantage of this observation. This delta palette table can be created after all of the literal data is obtained in the current CU, and it is signaled explicitly in the bitstream. Alternatively, it can be created adaptively during the coding process, so that the table does not have to be included in the bitstream. A delta_color_table_adaptive_flag is defined for this choice.

Another advanced scheme, called neighboring delta palette color table merge, is provided. For adaptive delta palette generation, the encoder can use the delta palette of the upper or the left CU as the initial starting point. For non-adaptive palette generation, the encoder can also use the delta palette of the upper or the left CU, and then compare the R-D costs among the upper CU, the left CU, and the current CU.

A delta_color_table_merge_flag is defined to indicate whether the current CU uses the delta palette color table of its left or its upper CU. The current CU carries the delta palette color table signaling explicitly only when delta_color_table_adaptive_flag == 0 and delta_color_table_merge_flag == 0.

For the merging process, if delta_color_table_merge_flag is asserted, another flag, delta_color_table_merge_direction, is defined to indicate whether the merge candidate is from the upper or the left CU.

An example of the encoding process for adaptive delta palette generation is shown below. At the decoder side, whenever the decoder receives literal data, it regenerates the delta palette using the reverse steps:

Define palette_table[] and palette_count[];

Initialize palette_table(n) = n (n = 0...255); alternatively, the palette_table[] of the upper or the left CU can be used as the initial value;

Initialize palette_count(n) = 0 (n = 0...255); alternatively, the palette_count[] of the upper or the left CU can be used as the initial value;

For any delta value c':

Locate n such that palette_table(n) == delta c';

Use n as the new index of delta c';

++palette_count(n);

Sort palette_count[] so that it is in descending order;

Sort palette_table[] accordingly;

Go back to step 1 until all delta c' in the current LCU are processed.
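The adaptive update loop above could be sketched, for delta values assumed to lie in the 8-bit range 0...255 and without the optional initialization from a neighboring CU, as follows; the class name and the use of a stable sort to break count ties are assumptions of this sketch:

#include <algorithm>
#include <array>
#include <numeric>
#include <vector>

// Maintains palette_table[]/palette_count[]: look up the delta value, emit its current index,
// bump its count, and keep both arrays sorted by descending count.
struct AdaptiveDeltaPalette {
    std::array<int, 256> table;   // palette_table[n]
    std::array<int, 256> count;   // palette_count[n]

    AdaptiveDeltaPalette() {
        std::iota(table.begin(), table.end(), 0);   // palette_table(n) = n
        count.fill(0);                              // palette_count(n) = 0
    }

    // Returns the index signaled for this delta value (looked up before the re-sort).
    int update(int deltaC) {
        int n = int(std::find(table.begin(), table.end(), deltaC) - table.begin());
        if (n >= 256) return -1;                    // value outside the assumed 8-bit range
        ++count[n];
        std::vector<int> order(256);
        std::iota(order.begin(), order.end(), 0);
        std::stable_sort(order.begin(), order.end(),
                         [&](int a, int b) { return count[a] > count[b]; });
        std::array<int, 256> newTable, newCount;
        for (int i = 0; i < 256; ++i) { newTable[i] = table[order[i]]; newCount[i] = count[order[i]]; }
        table = newTable;
        count = newCount;
        return n;
    }
};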

For any block that contains both text and graphics, a masking flag is used to separate the text section and the graphics section. The text section is compressed using the method described above, and the graphics section is compressed by another compression method.

Note that because the value of any pixel covered by the masking flag has already been coded losslessly by the text layer, each pixel in the graphics section can be treated as a "don't-care pixel". When the graphics section is compressed, any arbitrary value can be assigned to a don't-care pixel in order to obtain optimal compression efficiency.

Because the palette color table derivation can handle the lossy part, the index map must be compressed losslessly. This allows efficient processing using the 1D or 2D string match. In the present invention, the 1D or 2D string match is constrained to within the current LCU, but the search window can be extended beyond the current LCU. Note also that the matched distance can be encoded using a pair of motion vectors in the horizontal and vertical directions, i.e., (MVy = matched_distance / cuWidth, MVx = matched_distance − cuWidth × MVy).
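A minimal sketch of this distance-to-motion-vector mapping (and its inverse), reading matched_distance as a linear offset in scan order, is:

struct MotionVector { int mvx; int mvy; };

MotionVector distanceToMV(int matchedDistance, int cuWidth)
{
    MotionVector mv;
    mv.mvy = matchedDistance / cuWidth;            // vertical component
    mv.mvx = matchedDistance - cuWidth * mv.mvy;   // horizontal remainder
    return mv;
}

int mvToDistance(const MotionVector& mv, int cuWidth)
{
    return mv.mvy * cuWidth + mv.mvx;
}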

Given that an image would have different spatial texture orientations in local regions, a 1D search in either the horizontal or the vertical direction is allowed by defining a color_idx_map_pred_direction indicator. The optimal index scanning direction can be determined based on the R-D cost. FIG. 6 shows the scanning directions, starting from the very first position. FIG. 9 also illustrates the horizontal and vertical scanning patterns. Take an 8×8 CU as an example: deriveMatchPairs() and the associated entropy coding steps are performed twice, once for the horizontal scanning pattern and once for the vertical scanning pattern, and the final scanning direction is then chosen as the one with the smaller R-D cost.

Improved Binarization

As shown in FIG. 13, the palette color table and the pairs of matched information of the color index map are encoded. They are encoded using fixed-length binarization. Alternatively, variable-length binarization can be used.

For example, for the palette color table encoding, the table may have 8 different color values. Therefore, the corresponding color index map contains only 8 different indices. Instead of using fixed 3 bins to encode every index value equally, just one bit, e.g., 0, can be used to represent the background pixel. The remaining 7 pixel values are then represented using fixed-length codewords such as 1000, 1001, 1010, 1011, 1100, 1101, and 1110 to encode the color index. This is based on the fact that the background color may occupy the largest percentage of the pixels, so a dedicated codeword for it saves bins overall. This scenario happens quite often for screen content. Consider a 16×16 CU: the fixed 3-bin binarization requires 3×16×16 = 768 bins. Letting index 0 be the background color occupying 40% of the pixels, with the other colors equally distributed, only 2.8×16×16 < 768 bins are needed.
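The 2.8-bin figure follows directly from the stated assumptions (40% background at 1 bin per index, 60% at 4 bins per index: 0.4 × 1 + 0.6 × 4 = 2.8 bins per pixel). A tiny self-contained check of this arithmetic, under exactly those assumptions, is:

#include <cstdio>

int main()
{
    const int cuSize = 16 * 16;                 // 16x16 CU
    const double fixedBins = 3.0 * cuSize;      // fixed 3-bin binarization: 768 bins

    const double pBackground = 0.40;            // assumed share of background (index 0) pixels
    const double binsPerPixel = pBackground * 1.0 + (1.0 - pBackground) * 4.0;   // = 2.8
    const double variableBins = binsPerPixel * cuSize;                            // = 716.8 < 768

    std::printf("fixed: %.0f bins, variable-length: %.1f bins\n", fixedBins, variableBins);
    return 0;
}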

For the matched pair encoding, assuming a constrained implementation of the method within the area of the current CU, the maximum possible value can be used to bound the binarization. Mathematically, the matched distance and length could each be as long as 64×64 = 4K, but this cannot occur for both jointly. For every matched position, the matched distance is bounded by the distance between the current position and the very first position in the reference buffer (e.g., the first position in the current CU), denoted as L. Therefore, the maximum number of bins for the distance binarization is log2(L) + 1 (instead of a fixed length), and the maximum number of bins for the length binarization is log2(cuSize − L) + 1, where cuSize = cuWidth × cuHeight.

In addition to the palette color table and index map, residual coding can be significantly improved by a different binarization method. For HEVC RExt and the HEVC versions, the transform coefficients are binarized using variable-length codes, under the assumption that the residual magnitude should be small after prediction, transform, and quantization. However, after the introduction of the transform skip, and especially for transform skip on screen content with distinct colors, there commonly exist residuals with larger and random values (not close to the smaller values "1", "2", "0"). If the current HEVC coefficient binarization is used, very long codewords result. Alternatively, using fixed-length binarization saves code length for the residuals produced by the palette color table and index coding mode.

Adaptive Chroma Sampling for Mixed Content

The foregoing provides various techniques for high-efficiency screen content coding under the HEVC/HEVC-RExt framework. In practice, in addition to pure screen content (such as text and graphics) or pure natural video, there is also content containing both screen material and camera-captured natural video, referred to as mixed content. Currently, mixed content is processed with 4:4:4 chroma sampling. However, for the embedded camera-captured natural video portion in such mixed content, 4:2:0 chroma sampling may be sufficient to provide perceptually lossless quality. This is because the human visual system is less sensitive to spatial changes in the chroma components than to spatial changes in the luma component. Hence, subsampling is typically performed on the chroma part (e.g., the popular 4:2:0 video format) to achieve a noticeable bit-rate reduction while maintaining the same reconstruction quality.

The present invention provides a new flag (i.e., enable_chroma_subsampling), which is defined at the CU level and signaled recursively. For each CU, the encoder determines whether it is coded using 4:2:0 or 4:4:4 according to the rate-distortion cost.

FIG. 14A and FIG. 14B show the 4:2:0 and 4:4:4 chroma sampling formats, respectively.

At the encoder side, for each CU, assuming the input is the 4:4:4 source shown above, the rate-distortion cost is derived directly using the 4:4:4 encoding procedure with enable_chroma_subsampling = 0 or FALSE. The process then subsamples the 4:4:4 source to 4:2:0 to derive its bit consumption. The reconstructed 4:2:0 format is interpolated back to the 4:4:4 format for the distortion measurement (using SSE/SAD). Together with the bit consumption, the rate-distortion cost is derived for encoding the CU in the 4:2:0 space and compared with the cost of encoding the CU in 4:4:4. Whichever encoding method yields the smaller rate-distortion cost is then chosen as the final encoding.
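A highly simplified sketch of this per-CU decision, assuming the caller has already produced both trial reconstructions (with the 4:2:0 one interpolated back to 4:4:4) and measured the bits of each trial encoding, is given below; the SSE-based distortion, the lambda weighting, and the function names are assumptions of the sketch:

#include <cstdint>
#include <vector>

// Sum of squared errors between the original 4:4:4 CU and a reconstruction in the 4:4:4 domain.
double sse(const std::vector<uint8_t>& org, const std::vector<uint8_t>& rec)
{
    double d = 0.0;
    size_t n = org.size() < rec.size() ? org.size() : rec.size();
    for (size_t i = 0; i < n; ++i) {
        double e = double(org[i]) - double(rec[i]);
        d += e * e;
    }
    return d;
}

// Returns the value to signal in enable_chroma_subsampling for this CU.
bool chooseChromaSubsampling(const std::vector<uint8_t>& org444,
                             const std::vector<uint8_t>& rec444, double bits444,
                             const std::vector<uint8_t>& rec420to444, double bits420,
                             double lambda)
{
    double cost444 = sse(org444, rec444) + lambda * bits444;
    double cost420 = sse(org444, rec420to444) + lambda * bits420;
    return cost420 < cost444;   // true -> code this CU in the 4:2:0 space
}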

如图15所示,图15是从4:4:4到4:2:0的插值过程和从4:2:0到4:4:4的插值过程。通常,这种视频颜色采样格式转换过程需要大量插值滤波器。As shown in Figure 15, Figure 15 shows the interpolation process from 4:4:4 to 4:2:0 and the interpolation process from 4:2:0 to 4:4:4. Generally, such a video color sampling format conversion process requires a large number of interpolation filters.

为了降低实现的复杂度,可以利用HEVC插值滤波器(即DCT-IF)。如图15所示,“正方形框”表示原始的4:4:4样本。从4:4:4到4:2:0,利用DCT-IF为色度成分垂直插入半像素(“圆形”)。为便于说明,还示出了四分之一像素位置(“菱形”)。提取灰色阴影“圆形”组成4:2:0样本。对于从4:2:0到4:4:4的插值,所述过程起始于色度成分中的灰色“圆形”,水平插入半像素位置以获得所有“圆形”,然后利用DCT-IF垂直插入所述“正方形框”。选择所有插入的“正方形框”以组成重构4:4:4源。To reduce the complexity of the implementation, the HEVC interpolation filter (i.e., DCT-IF) can be used. As shown in Figure 15, the "square boxes" represent the original 4:4:4 samples. From 4:4:4 to 4:2:0, DCT-IF is used to vertically insert half pixels ("circles") for the chroma components. For ease of illustration, quarter-pixel positions ("diamonds") are also shown. Gray shaded "circles" are extracted to form 4:2:0 samples. For interpolation from 4:2:0 to 4:4:4, the process starts with the gray "circles" in the chroma components, horizontally inserts half-pixel positions to obtain all "circles", and then uses DCT-IF to vertically insert the "square boxes". All inserted "square boxes" are selected to form the reconstructed 4:4:4 source.

编码器控制Encoder control

如前面部分所讨论,揭示了一些标志以控制低级处理。例如,enable_packed_component_flag用于指示当前CU是利用填充形式还是传统的平面形式编码所述过程。是否启用填充模式可以取决于在编码器计算的所述R-D代价。对于实际编码器的实现,如图3所示,通过分析所述CU的柱状图并且找到决策的最佳阈值,实现低复杂度解决方案。As discussed in the previous section, several flags are exposed to control low-level processing. For example, the enable_packed_component_flag is used to indicate whether the current CU is encoded using padded or traditional planar encoding. Whether padded mode is enabled can depend on the R-D cost calculated at the encoder. For a practical encoder implementation, as shown in Figure 3, a low-complexity solution is achieved by analyzing the CU's histogram and finding the optimal threshold for decision making.

所述调色板颜色表的尺寸对复杂度有直接影响。引入maxColorNum以控制复杂度和编码效率之间的平衡。最直接的方法是选择产生最低R-D代价的那个。The size of the palette color table has a direct impact on complexity. maxColorNum is introduced to control the balance between complexity and coding efficiency. The most direct approach is to choose the one that produces the lowest R-D cost.
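One way to read "choose the one that produces the lowest R-D cost" is an exhaustive sweep over candidate table sizes, as in the added sketch below; rdCostForPaletteSize is a hypothetical callback standing in for a full encode pass of the CU with the given table size.

```cpp
#include <functional>
#include <limits>

// Sweep over candidate palette sizes up to maxColorNum and keep the size
// with the lowest R-D cost reported by the (hypothetical) cost callback.
int choosePaletteSize(int maxColorNum,
                      const std::function<double(int)>& rdCostForPaletteSize) {
    int bestSize = 1;
    double bestCost = std::numeric_limits<double>::max();
    for (int size = 1; size <= maxColorNum; ++size) {
        const double cost = rdCostForPaletteSize(size);
        if (cost < bestCost) {
            bestCost = cost;
            bestSize = size;
        }
    }
    return bestSize;
}
```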

索引图编码方向可以由R-D优化或者利用本地空间定位(例如基于Sobel算子的方向估测)确定。The index map coding direction can be determined by R-D optimization or by using the local spatial orientation (for example, Sobel-operator-based direction estimation).
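The Sobel-based alternative can be illustrated as follows (added sketch, not from the original). It accumulates horizontal and vertical Sobel responses over the index map and scans along the axis with the weaker gradient, on the assumption that longer runs form in the direction in which the indices change least.

```cpp
#include <cstdlib>
#include <vector>

// Returns true when a horizontal scan is preferred for the given index map.
// idx holds width*height palette indices in raster order.
bool preferHorizontalScan(const std::vector<int>& idx, int width, int height) {
    long long gx = 0, gy = 0;
    auto at = [&](int x, int y) { return idx[y * width + x]; };
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            // Sobel responses along x and y at (x, y).
            int sx = (at(x + 1, y - 1) + 2 * at(x + 1, y) + at(x + 1, y + 1))
                   - (at(x - 1, y - 1) + 2 * at(x - 1, y) + at(x - 1, y + 1));
            int sy = (at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1))
                   - (at(x - 1, y - 1) + 2 * at(x, y - 1) + at(x + 1, y - 1));
            gx += std::abs(sx);
            gy += std::abs(sy);
        }
    }
    return gx <= gy;  // indices vary less along rows -> scan horizontally
}
```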

本发明限制每个CTU/CU内部的处理。实际中可以放松此项约束。例如，对于颜色索引图处理，可以利用上侧CU和左侧CU的行缓存，如图16所示。利用上侧和左侧缓存，可以扩展搜索以进一步提高编码效率。假设上侧/左侧缓存是利用相邻CU的重构像素组成，在处理当前CU索引图之前可以参考这些像素(以及其对应索引)。例如，重新排列之后，所述当前CU索引图可以是14、14、14、…1、2、1(如1D展示)。没有行缓存参考，第一个“14”就会被编码成非匹配对。但是有了相邻行缓存，所述串匹配可以从第一个像素开始，如下所示(水平和垂直扫描模式均示出)。The present invention confines the processing to the inside of each CTU/CU. In practice, this constraint can be relaxed. For example, for the color index map processing, the line buffers of the upper CU and the left CU can be used, as shown in Figure 16. With the upper and left line buffers, the search can be extended to further improve the coding efficiency. Given that the upper/left line buffers are composed of the reconstructed pixels of the neighboring CUs, these pixels (and their corresponding indices) can be referenced before the current CU index map is processed. For example, after reordering, the current CU index map may be 14, 14, 14, ..., 1, 2, 1 (shown in 1-D form). Without the line buffer reference, the first "14" would be coded as an unmatched pair. With the neighboring line buffers, however, the string matching can start from the very first pixel, as shown below (both the horizontal and vertical scan modes are shown).
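The example referenced above is not reproduced in this text. As an added illustration of how the neighboring line buffer changes the search, the sketch below prepends the reconstructed indices of the upper/left CUs to the reference used for 1-D matching, so even the first index of the current CU can be coded as a matched pair; the greedy exhaustive search and the (distance, length) convention are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Finds the longest match for the indices starting at curIndices[pos].
// The reference consists of the neighboring line buffer followed by the
// already-processed part of the current CU. Returns (distance, length);
// (0, 0) means no match, i.e. the index is coded as an unmatched literal.
std::pair<std::size_t, std::size_t> longestMatch(
        const std::vector<int>& lineBuffer,   // indices from upper/left CUs
        const std::vector<int>& curIndices,   // 1-D scanned current CU indices
        std::size_t pos) {
    std::vector<int> ref = lineBuffer;
    ref.insert(ref.end(), curIndices.begin(), curIndices.begin() + pos);

    std::size_t bestDist = 0, bestLen = 0;
    for (std::size_t start = 0; start < ref.size(); ++start) {
        std::size_t len = 0;
        while (pos + len < curIndices.size() && start + len < ref.size() &&
               ref[start + len] == curIndices[pos + len]) {
            ++len;
        }
        if (len > bestLen) {
            bestLen = len;
            bestDist = ref.size() - start;  // distance back from the current position
        }
    }
    return {bestDist, bestLen};
}
```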

解码器语法Decoder Syntax

下面信息可以用于描述图2所示的解码器。本发明的语法符合HEVC RExt的委员会草案。The following information can be used to describe the decoder shown in Figure 2. The syntax of the present invention complies with the committee draft of HEVC RExt.

7.3.5.8编码单元语法:7.3.5.8 Coding unit syntax:

图17示出了应用于当前HEVC的装置和方法/流程。FIG. 17 shows an apparatus and method/process applied to the current HEVC.

上述方法/流程和设备可以应用到无线、有线或者两者结合的通信网络，并且在如下描述的设备和如下附图中实现。The methods/processes and devices described above can be applied to wireless networks, wired networks, or a combination of both, and can be implemented in the devices described below and shown in the following figures.

图18示出了根据本发明利用信令支持高级无线接收器的通信系统100的示例。一般来说,所述系统100能够让多个无线用户传输和接收数据及其他内容。所述系统100可以实现一个或者多个信道接入方法,例如码分多址接入(code division multiple access,CDMA)、时分多址(time division multiple access,TDMA)、频分多址(frequencydivision multiple access,FDMA)、正交FDMA(orthogonal frequency divisionmultiple access,OFDMA)或者单载波FDMA(single-carrier frequency divisionmultiple access,SC-FDMA)。FIG18 illustrates an example of a communication system 100 utilizing signaling to support advanced wireless receivers in accordance with the present invention. Generally, the system 100 enables multiple wireless users to transmit and receive data and other content. The system 100 can implement one or more channel access methods, such as code division multiple access (CDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal frequency division multiple access (OFDMA), or single-carrier frequency division multiple access (SC-FDMA).

在这个示例中,所述通信系统100包括用户设备(user equipment,UE)110a-110c、无线接入网(radio access network,RAN)120a-120b、核心网130、公共交换电话网络(public switched telephone network,PSTN)140、因特网150和其他网络160。尽管图18中示出了特定数量的组件或者单元,但是所述系统100中可以包括任意数量的这些组件或者单元。In this example, the communication system 100 includes user equipment (UE) 110a-110c, radio access networks (RAN) 120a-120b, a core network 130, a public switched telephone network (PSTN) 140, the Internet 150, and other networks 160. Although a specific number of components or units are shown in FIG18 , any number of these components or units may be included in the system 100.

所述UE 110a-110c用于在所述系统100中运行和/或通信。例如,所述UE 110a-110c用于传输和/或接受无线信号或者有线信号。每个UE 110a-110c代表任意合适的终端用户设备,可以包括如下设备(或者可以称为):用户设备(user equipment/device,UE)、无线传输/接受单元(wireless transmit/receive unit,WTRU)、移动台、固定或者移动用户单元、寻呼机、蜂窝电话、个人数字助理(personal digital assistant,PDA)、智能手机、笔记本电脑、电脑、触摸板、无线传感器或者消费者电子设备。The UEs 110a-110c are configured to operate and/or communicate within the system 100. For example, the UEs 110a-110c are configured to transmit and/or receive wireless signals or wired signals. Each UE 110a-110c represents any suitable end-user device, and may include (or be referred to as) the following: user equipment/device (UE), wireless transmit/receive unit (WTRU), mobile station, fixed or mobile subscriber unit, pager, cellular phone, personal digital assistant (PDA), smartphone, laptop, computer, touchpad, wireless sensor, or consumer electronic device.

这里所述的RAN 120a-120b分别包括基站170a-170b。每个基站170a-170b用于与一个或者多个UE 110a-110c无线连接,以使其能够进入所述核心网130、所述PSTN 140、所述因特网150和/或所述其他网络160。例如,所述基站170a-170b可以包括(或者是)一个或者多个熟知的设备,如基站收发信台(base transceiver station,BTS)、Node-B(NodeB)、演进型基站(evolved NodeB,eNodeB)、家庭基站、家庭演进型基站、站点控制器、接入点(access point,AP)、无线路由器、服务器、路由器、交换器或者其他有有线或者无线网络的处理实体。The RANs 120a-120b described herein include base stations 170a-170b, respectively. Each base station 170a-170b is configured to wirelessly connect to one or more UEs 110a-110c to enable the UEs to access the core network 130, the PSTN 140, the Internet 150, and/or the other networks 160. For example, the base stations 170a-170b may include (or be) one or more well-known devices, such as a base transceiver station (BTS), a Node-B (NodeB), an evolved NodeB (eNodeB), a home base station, a home evolved base station, a site controller, an access point (AP), a wireless router, a server, a router, a switch, or other processing entities in a wired or wireless network.

在图18中示出的实施例中,所述基站170a组成所述RAN 120a的一部分,所述RAN120a可以包括其他基站、单元和/或设备。另外,所述基站170b组成所述RAN 120b的一部分,所述RAN 120b可以包括其他基站、单元和/或设备。每个基站170a-170b在特定地理范围或区域中运行以传输和/或接收无线信号,其中,该特定地理区域有时也称作“小区”。在部分实施例中,使用多个收发器的多入多出(multiple-input multiple-output,MIMO)技术可以运用至每个小区。In the embodiment shown in FIG18 , the base station 170 a forms part of the RAN 120 a, which may include other base stations, units, and/or devices. Additionally, the base station 170 b forms part of the RAN 120 b, which may include other base stations, units, and/or devices. Each base station 170 a-170 b operates within a specific geographic area or region to transmit and/or receive wireless signals, where the specific geographic region is sometimes referred to as a “cell.” In some embodiments, multiple-input multiple-output (MIMO) technology using multiple transceivers may be applied to each cell.

所述基站170a-170b利用无线通信链路通过一个或者多个空口190与一个或者多个UE 110a-110c通信。所述空口190可以利用任意合适的无线接入技术。The base stations 170a-170b communicate with one or more UEs 110a-110c using wireless communication links over one or more air interfaces 190. The air interfaces 190 may utilize any suitable radio access technology.

预计所述系统100可以利用多个信道接入功能,包括上面描述的方案。在特定实施例中,所述基站和UE实现LTE、LTE-A和/或LTE-B。当然可以利用其他多种接入方案和无线协议。It is contemplated that the system 100 may utilize multiple channel access capabilities, including the schemes described above. In certain embodiments, the base station and UE implement LTE, LTE-A, and/or LTE-B. Of course, various other access schemes and wireless protocols may be utilized.

所述RAN 120a-120b与所述核心网130通信以向所述UE 110a-110c提供语音、数据、应用、基于IP的语音传输(Voice over Internet Protocol,VoIP)或者其他服务。可以理解的是,所述RAN 120a-120b和/或所述核心网130可以与一个或者多个其他RAN(没有示出)直接或者间接通信。所述核心网130也可以作为网关以接入其他网络(例如PSTN 140、因特网150和其他网络160)。另外,部分或者所有UE 110a-110c可以包括利用不同的无线技术和/或协议通过不同的无线链路与不同的无线网络通信的功能。The RANs 120a-120b communicate with the core network 130 to provide voice, data, applications, Voice over Internet Protocol (VoIP), or other services to the UEs 110a-110c. It will be appreciated that the RANs 120a-120b and/or the core network 130 may communicate directly or indirectly with one or more other RANs (not shown). The core network 130 may also serve as a gateway to access other networks (e.g., PSTN 140, Internet 150, and other networks 160). Furthermore, some or all of the UEs 110a-110c may include functionality to communicate with different wireless networks over different wireless links using different wireless technologies and/or protocols.

尽管图18示出了通信系统的一个示例,但是图18可以有多种变化。例如,所述通信系统100可以包括任意数量的UE、基站、网络或者其他合适配置中的组件,还可以包括这里任意附图中示出的EPC。Although Figure 18 shows an example of a communication system, there are many variations of Figure 18. For example, the communication system 100 may include any number of UEs, base stations, networks, or other components in a suitable configuration, and may also include the EPC shown in any of the figures herein.

图19A和19B示出了本发明提供的可以执行所述方法和理念的示例性设备。具体地,图19A示出了示例性UE 110,图19B示出了示例性基站170。这些组件可以用于所述系统100或者任意其他合适的系统中。Figures 19A and 19B illustrate exemplary devices that can implement the methods and concepts provided by the present invention. Specifically, Figure 19A illustrates an exemplary UE 110, and Figure 19B illustrates an exemplary base station 170. These components can be used in the system 100 or any other suitable system.

如图19A所示,所述UE 110包括至少一个处理单元200。所述处理单元200执行所述UE 110的各种处理操作。例如,所述处理单元200可以执行信号编码、数据处理、功率控制、输入/输出处理、或者其他任何使所述UE 110在所述系统100中运行的功能。所述处理单元200也支持上面细述的方法和理念。每个处理单元200包括用于执行一个或者多个操作的任意合适的处理或者计算设备。例如,每个处理单元200可以包括微处理器、微控制器、数字信号处理器、现场可编程门阵列或者专用集成电路。As shown in FIG19A , the UE 110 includes at least one processing unit 200. The processing unit 200 performs various processing operations for the UE 110. For example, the processing unit 200 may perform signal encoding, data processing, power control, input/output processing, or any other function that enables the UE 110 to operate in the system 100. The processing unit 200 also supports the methods and concepts detailed above. Each processing unit 200 includes any suitable processing or computing device for performing one or more operations. For example, each processing unit 200 may include a microprocessor, a microcontroller, a digital signal processor, a field programmable gate array, or an application-specific integrated circuit.

所述UE 110也包括至少一个收发器202。所述收发器202用于调制至少一个天线204传输的数据或者其他内容。所述收发器202也用于解调所述至少一个天线204接收的数据或者其他内容。每个收发器202包括用于生成无线传输信号和/或处理无线接收信号的任意合适结构。每个天线204包括传输和/或接收无线信号的任意合适的结构。一个或者多个收发器202可以用于所述UE 110中,并且一个或者多个天线204可以用于所述UE 110中。尽管示为单个功能单元,收发器202也可以利用至少一个发送器以及至少一个单独的接收器实现。The UE 110 also includes at least one transceiver 202. The transceiver 202 is configured to modulate data or other content for transmission by at least one antenna 204. The transceiver 202 is also configured to demodulate data or other content received by the at least one antenna 204. Each transceiver 202 includes any suitable structure for generating wireless transmission signals and/or processing wireless reception signals. Each antenna 204 includes any suitable structure for transmitting and/or receiving wireless signals. One or more transceivers 202 may be used in the UE 110, and one or more antennas 204 may be used in the UE 110. Although shown as a single functional unit, the transceiver 202 may also be implemented using at least one transmitter and at least one separate receiver.

所述UE 110还包括一个或者多个输入/输出设备206。所述输入/输出设备206促进与用户交互。每个输入/输出设备206包括给用户提供信息或者从用户接受信息的任意合适的结构,比如扬声器、麦克风、小键盘、键盘、显示屏或者触摸屏等。The UE 110 also includes one or more input/output devices 206. The input/output devices 206 facilitate interaction with the user. Each input/output device 206 includes any suitable structure for providing information to the user or receiving information from the user, such as a speaker, microphone, keypad, keyboard, display screen, or touch screen.

另外,所述UE 110包括至少一个存储器208。所述存储器208存储所述UE 110使用、生成或者收集的指令和数据。例如,所述存储器208可以存储所述处理单元200执行的软件或固件指令和用于减少或消除入向信号之间干扰的数据。每个存储器208包括任意合适的易失性和/或非易失性存储和检索设备。可以利用任意合适形式的存储器,比如随机存取存储器(random access memory,RAM)、只读存储器(read only memory,ROM)、硬盘、光盘、用户识别(subscriber identity module,SIM)卡、记忆棒、安全数码(secure digital,SD)存储卡等等。In addition, the UE 110 includes at least one memory 208. The memory 208 stores instructions and data used, generated, or collected by the UE 110. For example, the memory 208 may store software or firmware instructions executed by the processing unit 200 and data used to reduce or eliminate interference between incoming signals. Each memory 208 includes any suitable volatile and/or non-volatile storage and retrieval device. Any suitable form of memory may be used, such as random access memory (RAM), read-only memory (ROM), a hard disk, an optical disk, a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.

如图19B所示,所述基站170包括至少一个处理单元250、至少一个发送器252、至少一个接收器254、一个或者多个天线256和至少一个存储器258。所述处理单元250执行所述基站170的各种处理操作,比如信号编码、数据处理、功率控制、输入/输出处理或者其他任意功能。所述处理单元250也可以支持上面细述的方法和理念。每个处理单元250包括用于执行一个或者多个操作的任意合适的处理或者计算设备。例如,每个处理单元250可以包括微处理器、微控制器、数字信号处理器、现场可编程门阵列或者专用集成电路。As shown in FIG19B , the base station 170 includes at least one processing unit 250, at least one transmitter 252, at least one receiver 254, one or more antennas 256, and at least one memory 258. The processing unit 250 performs various processing operations of the base station 170, such as signal encoding, data processing, power control, input/output processing, or any other functions. The processing unit 250 may also support the methods and concepts detailed above. Each processing unit 250 includes any suitable processing or computing device for performing one or more operations. For example, each processing unit 250 may include a microprocessor, a microcontroller, a digital signal processor, a field programmable gate array, or an application-specific integrated circuit.

每个发送器252包括用于生成无线传输至一个或者多个UE或者其他设备的信号的任意合适结构。每个接收器254包括用于处理从一个或者多个UE或者其他设备无线接收的信号的任意合适结构。尽管示为单独的组件，但是至少一个发送器252和至少一个接收器254可以组成收发器。每个天线256包括传输和/或接收无线信号的任意合适结构。尽管这里示出的公共天线256与所述发送器252和所述接收器254均耦合，但一个或者多个天线256可以与所述发送器252耦合，一个或者多个单独的天线256可以与所述接收器254耦合。每个存储器258包括任意合适的易失性和/或非易失性存储和检索设备。Each transmitter 252 includes any suitable structure for generating signals for wireless transmission to one or more UEs or other devices. Each receiver 254 includes any suitable structure for processing signals wirelessly received from one or more UEs or other devices. Although shown as separate components, at least one transmitter 252 and at least one receiver 254 can constitute a transceiver. Each antenna 256 includes any suitable structure for transmitting and/or receiving wireless signals. Although a common antenna 256 is shown here coupled to both the transmitter 252 and the receiver 254, one or more antennas 256 may be coupled to the transmitter 252, and one or more separate antennas 256 may be coupled to the receiver 254. Each memory 258 includes any suitable volatile and/or non-volatile storage and retrieval device.

关于UE 110和基站170的其他细节对于现有技术人员都是已知的。因此这里就不再详细阐述。Other details about UE 110 and base station 170 are known to those skilled in the art and are therefore not described in detail here.

为本专利文档中使用的特定术语和短语进行定义是有帮助的。术语“包括”和“包含”以及它们的派生词表示没有限制的包括。术语“或者”是包容性的,意为和/或。短语“与……关联”和“与其关联”以及其派生的短语意味着包括,被包括在内、与……互连、包含、被包含在内、连接到或与……连接、耦合到或与……耦合、可与……通信、与……配合、交织、并列、接近、被绑定到或与……绑定、具有、具有……属性,等等。It is helpful to define certain terms and phrases used in this patent document. The terms "include" and "comprising," and their derivatives, mean inclusion without limitation. The term "or" is inclusive, meaning and/or. The phrases "associated with," and "associated therewith," and their derivatives, mean to include, be included within, be interconnected with, include, be contained within, be connected to or connected with, be coupled to or coupled with, be communicable with, cooperate with, intertwine, be juxtaposed with, be proximate to, be bound to or bound with, have, have the property of, and the like.

虽然本发明就某些实施例和一般相关方法方面进行了描述,但是对本领域技术人员而言,对实施例和方法的各种更改和变更将是显而易见的。因此,示例实施例的上述描述不限定或约束本发明。正如以下权利要求定义,其它修改、替代以及变更也是可能的,而不偏离本发明的精神和范围。Although the present invention has been described in terms of certain embodiments and generally related methods, various modifications and variations to the embodiments and methods will be apparent to those skilled in the art. Therefore, the foregoing description of the exemplary embodiments does not limit or restrict the present invention. As defined in the following claims, other modifications, substitutions, and variations are possible without departing from the spirit and scope of the present invention.

Claims (25)

1.一种将屏幕内容编码成比特流的方法，其特征在于，所述方法包括：1. A method for encoding screen content into a bitstream, characterized in that the method comprises: 为屏幕内容的第一编码单元(coding unit,CU)创建调色板颜色表；creating a palette color table for a first coding unit (CU) of the screen content; 利用所述调色板颜色表为所述第一CU创建有索引的第一颜色索引图，所述第一颜色索引图中的索引包含在所述调色板颜色表中；creating an indexed first color index map for the first CU using the palette color table, wherein the indices in the first color index map are contained in the palette color table; 利用所述调色板颜色表为所述屏幕内容的第二CU创建有索引的第二颜色索引图，所述第二颜色索引图中的索引包含在所述调色板颜色表中；creating an indexed second color index map for a second CU of the screen content using the palette color table, wherein the indices in the second color index map are contained in the palette color table; 将所述调色板颜色表以及第一CU的第一颜色索引图，以及第二CU的第二颜色索引图编码成比特流。encoding the palette color table, the first color index map of the first CU, and the second color index map of the second CU into a bitstream. 2.根据权利要求1所述的方法，其特征在于，所述方法是利用平面颜色格式或者交错颜色格式进行处理的。2. The method according to claim 1, wherein the method is performed using a planar color format or an interleaved color format. 3.根据权利要求2所述的方法，其特征在于，所述方法是以从CU级别、条带级别、图像级别或者序列级别中选择的级别进行处理的。3. The method according to claim 2, wherein the method is performed at a level selected from a CU level, a slice level, a picture level, or a sequence level. 4.根据权利要求1所述的方法，其特征在于，所述调色板颜色表是从所述CU或者相邻CU获取的。4. The method according to claim 1, wherein the palette color table is obtained from the CU or an adjacent CU. 5.根据权利要求4所述的方法，其特征在于，所述调色板颜色表是利用像素域中重构的CU从相邻CU获取的。5. The method according to claim 4, wherein the palette color table is obtained from adjacent CUs using CUs reconstructed in the pixel domain. 6.根据权利要求5所述的方法，其特征在于，还包括：不管所述CU的深度和尺寸，基于可用重构像素生成所述相邻CU的调色板颜色表。6. The method according to claim 5, further comprising: generating the palette color table of the adjacent CUs based on the available reconstructed pixels, regardless of the depth and size of the CU. 7.根据权利要求4所述的方法，其特征在于，还包括：生成所述相邻CU的调色板颜色表，其中所述相邻CU是利用调色板模式进行编码的。7. The method according to claim 4, further comprising: generating the palette color table of the adjacent CUs, wherein the adjacent CUs are encoded using a palette mode. 8.根据权利要求4所述的方法，其特征在于，所述相邻CU没有利用调色板模式进行编码，相邻CU的调色板颜色表由之前利用调色板模式进行编码的CU传递。8. The method according to claim 4, wherein the adjacent CUs are not encoded using a palette mode, and the palette color table of the adjacent CUs is passed from the CUs that were previously encoded using a palette mode. 9.根据权利要求1所述的方法，其特征在于，所述调色板颜色表是在解码器的像素域中获取的，其中解析了所述编码的比特流，从而为所述CU重构调色板颜色表和颜色索引图。9. The method according to claim 1, wherein the palette color table is obtained in the pixel domain at the decoder, where the encoded bitstream is parsed to reconstruct the palette color table and the color index map for the CU. 10.根据权利要求6所述的方法，其特征在于，通过合并所述颜色索引和所述调色板颜色表在所述CU的每个位置获取了像素值。10. The method according to claim 6, wherein a pixel value is obtained at each position of the CU by combining the color index and the palette color table. 11.根据权利要求1所述的方法，其特征在于，还包括：针对对应的索引，基于之前获取的调色板颜色表对所述CU的颜色或者像素值进行分类。11. The method according to claim 1, further comprising: classifying the colors or pixel values of the CU into the corresponding indices based on the previously obtained palette color table. 12.根据权利要求1所述的方法，其特征在于，还包括：将新的语法元素写入所述编码的比特流。12. The method according to claim 1, further comprising: writing new syntax elements into the encoded bitstream. 13.根据权利要求1所述的方法，其特征在于，根据柱状图或者其实际颜色值生成并排列所述调色板颜色表。13.
The method according to claim 1, characterized in that the palette color table is generated and arranged according to a histogram or their actual color values. 14.根据权利要求1所述的方法，其特征在于，将所述CU的每个像素转换成了所述调色板颜色表中的颜色索引。14. The method according to claim 1, wherein each pixel of the CU is converted into a color index in the palette color table. 15.根据权利要求1所述的方法，其特征在于，为所述CU定义了标志以指示所述CU是以填充形式还是平面模式进行处理。15. The method according to claim 1, wherein a flag is defined for the CU to indicate whether the CU is processed in the packed format or the planar mode. 16.根据权利要求1所述的方法，其特征在于，通过编码所述调色板颜色表的尺寸以及所述调色板颜色表中的每个颜色来处理所述调色板颜色表。16. The method according to claim 1, wherein the palette color table is processed by encoding the size of the palette color table and each color in the palette color table. 17.根据权利要求1所述的方法，其特征在于，还包括：生成用以指示所述CU利用的是左侧CU还是上侧CU的调色板颜色表的标志。17. The method according to claim 1, further comprising: generating a flag indicating whether the CU uses the palette color table of the left CU or of the upper CU. 18.根据权利要求1所述的方法，其特征在于，利用从包括以下串匹配的组中选择的串匹配对所述索引图进行了编码:一维(one dimensional,1-D)串匹配、混合1-D串匹配和二维(two dimensional,2-D)串匹配，其中:18. The method according to claim 1, characterized in that the index map is encoded using a string match selected from the group consisting of: a one-dimensional (1-D) string match, a hybrid 1-D string match, and a two-dimensional (2-D) string match, wherein: 所述串匹配是利用匹配对进行发送的。the string match is signaled using matched pairs. 19.根据权利要求17所述的方法，其特征在于，利用运行哈希方法执行了所述串匹配。19. The method according to claim 17, wherein the string matching is performed using a running hash method. 20.根据权利要求1所述的方法，其特征在于，通过识别当前像素的位置和所述CU中作为起点的参考像素的位置，对所述颜色索引图执行了二维(two dimensional,2D)搜索方法。20. The method according to claim 1, characterized in that a two-dimensional (2D) search method is performed on the color index map by identifying the position of the current pixel and the position of the reference pixel in the CU as the starting point. 21.根据权利要求1所述的方法，其特征在于，所述CU具有利用下采样4:2:0采样格式处理的4:4:4格式。21. The method according to claim 1, wherein the CU has a 4:4:4 format that is processed using a down-sampled 4:2:0 sampling format. 22.根据权利要求21所述的方法，其特征在于，所述下采样格式是以从CU级别、条带级别、图像级别或者序列级别中选择的级别进行处理的。22. The method according to claim 21, wherein the down-sampled format is processed at a level selected from a CU level, a slice level, a picture level, or a sequence level. 23.一种将屏幕内容编码成比特流的处理器，其特征在于，所述处理器用于：23. A processor for encoding screen content into a bitstream, characterized in that the processor is configured to: 为屏幕内容的第一编码单元(coding unit,CU)创建调色板颜色表；create a palette color table for a first coding unit (CU) of the screen content; 利用所述调色板颜色表为所述第一CU创建有索引的第一颜色索引图，所述第一颜色索引图中的索引包含在所述调色板颜色表中；create an indexed first color index map for the first CU using the palette color table, wherein the indices in the first color index map are contained in the palette color table; 利用所述调色板颜色表为所述屏幕内容的第二CU创建有索引的第二颜色索引图，所述第二颜色索引图中的索引包含在所述调色板颜色表中；create an indexed second color index map for a second CU of the screen content using the palette color table, wherein the indices in the second color index map are contained in the palette color table; 将所述调色板颜色表以及第一CU的第一颜色索引图，以及第二CU的第二颜色索引图编码成比特流。and encode the palette color table, the first color index map of the first CU, and the second color index map of the second CU into a bitstream. 24.根据权利要求23所述的处理器，其特征在于，所述调色板颜色表是从所述CU或者相邻CU获取的。24. The processor according to claim 23, wherein the palette color table is obtained from the CU or an adjacent CU.
25.根据权利要求24所述的处理器，其特征在于，所述调色板颜色表是利用像素域中重构的CU从相邻CU获取的。25. The processor according to claim 24, characterized in that the palette color table is obtained from adjacent CUs using CUs reconstructed in the pixel domain.
HK16108372.3A 2013-11-22 2014-11-24 Advanced screen content coding solution HK1220531B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201361907903P 2013-11-22 2013-11-22
US61/907,903 2013-11-22
US14/549,405 US10291827B2 (en) 2013-11-22 2014-11-20 Advanced screen content coding solution
US14/549,405 2014-11-20
PCT/US2014/067155 WO2015077720A1 (en) 2013-11-22 2014-11-24 Advanced screen content coding solution

Publications (2)

Publication Number Publication Date
HK1220531A1 HK1220531A1 (en) 2017-05-05
HK1220531B true HK1220531B (en) 2020-08-14
