CN103905818B - Method for rapidly determining the intra-frame prediction mode in the HEVC standard based on the Hough transform

Info

Publication number: CN103905818B
Application number: CN201410161592.5A
Authority: CN
Inventors: 端木春江; 董朵
Original assignee: Zhejiang Normal University CJNU
Current assignee: Zhejiang Normal University CJNU
Priority date: 2014-04-13
Filing date: 2014-04-13
Publication date: 2017-02-15
Anticipated expiration: 2034-04-13
Also published as: CN103905818A

Abstract

The invention discloses a method for rapidly determining the intra-frame prediction coding mode in HEVC, the latest international video compression standard. The method first performs edge detection and a Hough transform on the current PU. It then statistically analyzes the tangents of the direction angles of the detected straight lines to extract the most probable edge directions of the current PU. Using this heuristic information, it narrows down the candidate prediction modes that are checked in the RMD process to determine the MPM, and the final coding mode is then determined by the RDO process. Experiments show that the method increases the running speed of HEVC by more than 20%.

Description

A fast determination method for intra prediction mode in HEVC standard based on Hough transform

Technical field

The present invention relates to a method for speeding up HEVC, currently the latest international compression standard in the field of video coding, and in particular to a method that uses the Hough transform and related techniques to speed up the determination of the optimal coding mode of an intra-coded block.

Background

HEVC (High Efficiency Video Coding) is currently the latest international video coding standard. It was developed jointly by the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG) as the high-performance successor to H.264/AVC, and its coding efficiency is more than 50% higher than that of H.264. HEVC still adopts the hybrid coding framework of prediction plus transform. The overall framework and processing flow have changed little compared with earlier standards, but many new techniques have been added to the HEVC standard to improve coding efficiency.

In the H.264 video coding standard, the macroblock (MB) is the most basic coding unit. With the spread of high-definition video, however, such fixed-size coding blocks are poorly suited to ultra-high-definition video at 2K and 4K resolutions. HEVC therefore drops the MB concept and represents the block structure more flexibly, using the coding unit (CU), the prediction unit (PU) and the transform unit (TU). The CU is the most basic partitioning unit; its maximum size is 64×64 and its minimum size is 8×8. The PU is the basic unit that carries the information related to the prediction process, and its size is smaller than or equal to the CU size. A CU may contain one or more PUs of different sizes, and a PU contains several TUs, where the TU is the basic unit for transform and quantization. HEVC uses a quadtree structure to partition coding units.

To overcome shortcomings of H.264/AVC such as its small number of prediction modes and its limited prediction accuracy, HEVC greatly increases the number of available prediction modes. HEVC intra prediction has 35 prediction modes in total: mode 0 is planar prediction, mode 1 is DC prediction, and modes 2 to 34 are angular predictions in various directions.

In the HEVC reference model HM4.0, the intra mode decision process has three stages. First, a rough selection is made from the mode set suitable for the PU size: the rate-distortion cost (RD cost) of every possible prediction mode is calculated, the modes are sorted in order of increasing cost, and the group of modes with the smallest RD costs forms the rough-selection subset. This is called the rough mode decision (RMD) process. When the distortion is calculated here, only the distortion of the prediction residual is considered, not the distortion after encoding and decoding. Second, the encoder checks whether the intra prediction modes of the reference blocks to the left of and above the current block are in the rough-selection subset, and adds them if they are not. Finally, rate-distortion optimization is performed on the prediction modes in the rough-selection subset, and the mode with the lowest cost is the optimal prediction mode. This is called the rate-distortion optimization (RDO) process. RDO has to calculate the distortion of each candidate mode after encoding and decoding, so it is computationally expensive. Because HEVC offers many selectable intra prediction modes, determining the intra mode has high computational complexity and is very time-consuming.

In summary, the larger number of selectable intra prediction modes in HEVC improves prediction accuracy and coding efficiency, but it also greatly increases the coding complexity of HEVC. As a result, the time needed to determine the optimal intra prediction mode increases considerably.

The present invention aims to reduce the time needed to determine the intra prediction mode in HEVC while maintaining prediction accuracy. Because the straight-line edges in an image block indicate the direction in which the pixels of the block vary, the invention uses this heuristic information to reduce the number of candidate intra prediction modes.

Summary of the invention

In view of the above shortcomings of the prior art, the present invention proposes a new method for fast intra mode determination based on the Hough transform. To reduce the amount of computation while preserving prediction accuracy as far as possible, the invention first performs edge detection on the image, then uses the Hough transform to find the edges that are straight lines, and next determines the angles of the straight edges within the current prediction unit (PU). A statistical histogram is used to determine the most probable prediction directions of the current PU, and the most probable candidate prediction modes are determined from these directions. This reduces the number of candidate modes that must be checked in the rough mode decision (RMD) process and the rate-distortion optimization (RDO) process, which greatly reduces the computation while still selecting as good a prediction mode as possible for the current PU. The method is characterized in that, for each intra prediction unit (PU) in the video image, it comprises:

Step one: determine whether the size of the current PU is 4×4. If so, perform mode sub-sampling with a step size of 1 and jump directly to step eight. Otherwise, apply the Canny operator to the current PU.

Step two: based on the result of the Canny operator, thin the edges with a morphological method.

Step three: select a suitable threshold to extract and connect the edge points in the PU.

Step four: apply the Hough transform to the current PU to detect straight-line edge pixels, that is, keep the edge points that lie on a straight line and remove the edge points that lie on curves.

Step five: for each straight edge, determine its angle θ relative to the horizontal direction and the tangent tan θ.

Step six: using the tangent ranges of the prediction angles of the intra prediction modes and the tangents of the straight edges detected in the current PU, build a statistical histogram of the total number of straight-edge pixels in the current PU that fall within each range.

Step seven: from this histogram, determine the prediction modes with the largest straight-edge pixel totals.

Step eight: based on the size of the current PU and the histogram, determine the candidate intra modes that enter the rough mode decision (RMD) process.

Step nine: based on the RMD results and the size of the current PU, determine the candidate intra modes that enter the mode refinement (RDO) process.

Step ten: based on the RDO results, determine the optimized intra prediction mode of the current PU.

Further, in step one, mode sub-sampling is performed for 4×4 blocks. In the HEVC reference model HM4.0, 19 intra prediction modes are available for a 4×4 PU, and the directional prediction modes among them are 7, 14, 6, 13, 1, 12, 5, 11, 4, 15, 8, 16, 2, 17, 9, 18 and 10. Mode sub-sampling of a 4×4 block keeps only half of these modes, namely modes 7, 6, 1, 5, 4, 8, 2, 9 and 10.
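The sub-sampling above amounts to keeping every other entry of the ordered directional mode list. A minimal sketch, in which the function and constant names are illustrative rather than taken from HM4.0:

```cpp
#include <cstddef>
#include <vector>

// Keep every other entry of an ordered mode list, starting with the first.
std::vector<int> subsampleModes(const std::vector<int>& modes) {
    std::vector<int> kept;
    for (std::size_t i = 0; i < modes.size(); i += 2) {
        kept.push_back(modes[i]);
    }
    return kept;
}

// Directional modes for a 4x4 PU, in the order listed in the text.
const std::vector<int> kDirectionalModes4x4 = {
    7, 14, 6, 13, 1, 12, 5, 11, 4, 15, 8, 16, 2, 17, 9, 18, 10};
```

Applied to the 17 listed modes, this yields the 9 modes 7, 6, 1, 5, 4, 8, 2, 9 and 10 named in the text.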

The Canny edge detection proceeds as follows: a Gaussian filter is first applied to smooth the image, and the magnitude and direction of the gray-level gradient are then calculated from mean finite differences in a 3×3 neighborhood.

Further, in step two, a morphological method is used to thin the detected thick edges; that is, a morphological erosion operation is performed with the following template element:

Further, in step three, edge points are extracted with the threshold $T_h$. The optimal threshold is selected by an iterative method, which is implemented as follows:

(1) Find the minimum value $Z_{\min}$ and the maximum value $Z_{\max}$ in the current PU of the image; the initial threshold is then $T_0 = (Z_{\min} + Z_{\max})/2$.

(2) Using the threshold $T_k$ (where $k$ is the iteration count), divide the image into an edge region and a non-edge region, and calculate the averages $Z_O$ and $Z_B$ of the pixel values $h_i$ in the edge region and $h_j$ in the non-edge region, respectively. Once $Z_O$ and $Z_B$ have been calculated, the new threshold $T_{k+1}$ is computed as

$$T_{k+1} = (Z_O + Z_B)/2$$

(3) If $T_{k+1} = T_k$, then $T_k$ is the desired threshold and the algorithm ends; otherwise go back to step (2). The iteration continues until it converges to a stable threshold, which is the final result $T_h$.
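Steps (1) to (3) can be sketched as follows. The code assumes that values at or above the current threshold form the edge region, that the region averages are plain arithmetic means, and that convergence is tested with a small tolerance `eps` plus an iteration cap; none of these details is fixed by the text, and the function name is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Iteratively selects a threshold for the values in `v` (for example the
// edge strengths of one PU). Assumptions: values >= T form the edge
// region, region averages are arithmetic means, and the loop stops once
// the threshold changes by less than `eps`.
double iterativeThreshold(const std::vector<double>& v, double eps = 1e-6) {
    assert(!v.empty());
    const auto mm = std::minmax_element(v.begin(), v.end());
    double t = (*mm.first + *mm.second) / 2.0;   // T_0
    for (int k = 0; k < 1000; ++k) {             // safety cap on iterations
        double sumEdge = 0.0, sumRest = 0.0;
        int nEdge = 0, nRest = 0;
        for (double x : v) {
            if (x >= t) { sumEdge += x; ++nEdge; }
            else        { sumRest += x; ++nRest; }
        }
        if (nEdge == 0 || nRest == 0) break;     // all values on one side
        const double zO = sumEdge / nEdge;       // Z_O
        const double zB = sumRest / nRest;       // Z_B
        const double next = (zO + zB) / 2.0;     // T_{k+1}
        const bool converged = std::fabs(next - t) < eps;
        t = next;
        if (converged) break;
    }
    return t;
}
```

For the values 10, 12, 14, 200, 210, 220 the initial threshold is 115, the two region means are 210 and 12, and the iteration settles at 111 after one update.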

Here a double-threshold method is used to detect and connect the final edges from the candidate edge points. The threshold for extracting edge points is $T_h$; the threshold for connecting edge points should be smaller than $T_h$ so that edges can be linked, and the present invention uses $T_h/2$. That is, after the processing of steps one and two, pixels whose value is greater than $T_h$ are edge points. Taking these edge points as seeds, adjacent pixels whose value is greater than $T_h/2$ are found and added as edge pixels. This process is repeated until every pixel in the PU has been scanned.
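The double-threshold linking can be sketched as a seed-growing pass. This version uses a queue-based flood fill over 8-connected neighbours instead of repeated raster scans, which is a substitution made for brevity; the row-major `strength` layout and the function name are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Marks edge pixels in a row-major width x height strength map.
// Pixels above tHigh are seeds; neighbours above tHigh / 2 that touch an
// edge pixel (8-connectivity, assumed) are added transitively.
std::vector<bool> linkEdges(const std::vector<double>& strength,
                            int width, int height, double tHigh) {
    assert(strength.size() == static_cast<std::size_t>(width) * height);
    const double tLow = tHigh / 2.0;
    std::vector<bool> isEdge(strength.size(), false);
    std::queue<std::pair<int, int>> pending;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int i = y * width + x;
            if (strength[i] > tHigh) {
                isEdge[i] = true;
                pending.push(std::make_pair(x, y));
            }
        }
    }
    while (!pending.empty()) {
        const std::pair<int, int> p = pending.front();
        pending.pop();
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int nx = p.first + dx;
                const int ny = p.second + dy;
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                const int j = ny * width + nx;
                if (!isEdge[j] && strength[j] > tLow) {
                    isEdge[j] = true;
                    pending.push(std::make_pair(nx, ny));
                }
            }
        }
    }
    return isEdge;
}
```

A weak pixel that exceeds $T_h/2$ but is not connected to any seed stays unmarked, which is what distinguishes this from thresholding once at $T_h/2$.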

Table 1. Prediction angle, boundary angle range and tangent range of each prediction mode for PUs of size 8×8, 16×16 and 32×32

Table 2. Prediction angle, boundary angle range and tangent range of each prediction mode for PUs of size 64×64

| Prediction mode | Prediction angle | Boundary angle range | Tangent range |
|---|---|---|---|
| 1 | -90 | (-45, -135) | (-∞, -1) ∪ (1, +∞) |
| 2 | 0 | [-45, 45] | [-1, 1] |

Further, in step six, the tangent ranges used for the statistics of the straight-edge angles are listed in Table 1 and Table 2. For example, as shown in Table 1, the tangent range for candidate mode 7 is $[-1.1033, -0.7417)$. The pixel count for this range is initialized to zero, $C_7 = 0$; whenever the tangent of the angle of a straight edge falls within this range, $C_7 = C_7 + a_i$, where $a_i$ is the number of pixels of that line inside the current PU. Applying this procedure to all candidate modes yields the required statistical histogram over the modes. Modes 0 and 3 are non-directional prediction modes, so they are not included in Table 1 and Table 2; they are always passed on as candidate modes to the RDO rough selection.
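As an illustration of the accumulation rule $C_m = C_m + a_i$, the sketch below builds the histogram for a 64×64 PU using only the two tangent ranges of Table 2. The `LineEdge` struct and the function name are hypothetical, and the Table 1 ranges for smaller PUs are not reproduced here.

```cpp
#include <array>
#include <cmath>
#include <vector>

// One detected straight edge: the tangent of its angle and the number of
// its pixels inside the current PU (a_i in the text).
struct LineEdge {
    double tangent;
    int pixelCount;
};

// Returns pixel totals indexed by mode number for a 64x64 PU.
// Index 0 is unused; index 1 is C_1 and index 2 is C_2 (Table 2).
std::array<int, 3> histogram64x64(const std::vector<LineEdge>& lines) {
    std::array<int, 3> counts = {{0, 0, 0}};
    for (const LineEdge& line : lines) {
        if (std::fabs(line.tangent) <= 1.0) {
            counts[2] += line.pixelCount;   // tangent in [-1, 1]: mode 2
        } else {
            counts[1] += line.pixelCount;   // |tangent| > 1: mode 1
        }
    }
    return counts;
}
```

A vertical line, whose tangent is infinite, falls into the mode 1 bin under this test.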

Table 3. Number of candidate prediction modes entering the RMD rough selection process for each PU size in HM4.0

| PU size | 4×4 | 8×8 | 16×16 | 32×32 | 64×64 |
|---|---|---|---|---|---|
| Number of modes | 19 | 35 | 35 | 35 | 4 |

Table 4. Number of candidate prediction modes entering the RMD rough selection process for each PU size in the present invention

| PU size | 4×4 | 8×8 | 16×16 | 32×32 | 64×64 |
|---|---|---|---|---|---|
| Number of modes | 9 | 7 | 5 | 3 | 1 |

Further, step eight proceeds according to Table 4. For example, for a 16×16 PU, the 5 candidate modes with the largest values in the histogram above are selected. A comparison of Table 3 and Table 4 shows that the invention greatly reduces the number of candidate modes entering the RMD rough selection process, which greatly reduces the computation and speeds up HEVC encoding.
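The selection in step eight can be sketched as a top-N pick over the histogram, with N taken from Table 4. The function names and the tie-breaking rule (ties keep their input order) are assumptions. For 4×4 PUs the count of 9 comes from mode sub-sampling rather than from a histogram.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Number of candidate modes entering RMD for a square PU of the given
// width, following Table 4. Returns 0 for unsupported sizes.
int rmdCandidateCount(int puSize) {
    switch (puSize) {
        case 4:  return 9;
        case 8:  return 7;
        case 16: return 5;
        case 32: return 3;
        case 64: return 1;
        default: return 0;
    }
}

// `hist` holds (mode, pixel total) pairs. Returns the modes with the n
// largest totals; equal totals keep their input order (assumed rule).
std::vector<int> topModes(std::vector<std::pair<int, int>> hist, int n) {
    std::stable_sort(hist.begin(), hist.end(),
                     [](const std::pair<int, int>& a,
                        const std::pair<int, int>& b) {
                         return a.second > b.second;
                     });
    std::vector<int> modes;
    for (std::size_t i = 0; i < hist.size() && static_cast<int>(i) < n; ++i) {
        modes.push_back(hist[i].first);
    }
    return modes;
}
```

For example, `topModes(hist, rmdCandidateCount(32))` keeps the three modes with the largest straight-edge pixel totals for a 32×32 PU.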

Table 5. Number of candidate prediction modes entering the RDO refinement process for each PU size in HM4.0

| PU size | 4×4 | 8×8 | 16×16 | 32×32 | 64×64 |
|---|---|---|---|---|---|
| Number of modes | 8 | 8 | 3 | 3 | 3 |

Table 6. Number of candidate prediction modes entering the RDO refinement process for each PU size in the present invention

| PU size | 4×4 | 8×8 | 16×16 | 32×32 | 64×64 |
|---|---|---|---|---|---|
| Number of modes | 5 | 4 | 3 | 2 | 1 |

Further, step nine proceeds according to Table 6. For example, for a 16×16 PU, the 3 candidate modes that perform best in the RMD process are passed on to the final RDO selection. A comparison of Table 5 and Table 6 shows that the invention greatly reduces the number of candidate modes entering the RDO refinement process, which greatly reduces the computation and speeds up HEVC encoding.
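Step nine can be sketched in the same spirit: the candidates are ranked by their RMD cost and the best few, as given in Table 6, are kept for RDO. The `ModeCost` struct and function names are hypothetical, and the cost values in any call would come from the encoder's RMD pass.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A candidate mode together with its RMD cost (hypothetical struct).
struct ModeCost {
    int mode;
    double rmdCost;
};

// Number of candidate modes entering RDO for a square PU of the given
// width, following Table 6. Returns 0 for unsupported sizes.
std::size_t rdoCandidateCount(int puSize) {
    switch (puSize) {
        case 4:  return 5;
        case 8:  return 4;
        case 16: return 3;
        case 32: return 2;
        case 64: return 1;
        default: return 0;
    }
}

// Returns the modes with the lowest RMD costs, cheapest first.
std::vector<int> selectForRdo(std::vector<ModeCost> candidates, int puSize) {
    const std::size_t k =
        std::min(candidates.size(), rdoCandidateCount(puSize));
    std::partial_sort(candidates.begin(),
                      candidates.begin() + static_cast<std::ptrdiff_t>(k),
                      candidates.end(),
                      [](const ModeCost& a, const ModeCost& b) {
                          return a.rmdCost < b.rmdCost;
                      });
    std::vector<int> modes;
    for (std::size_t i = 0; i < k; ++i) {
        modes.push_back(candidates[i].mode);
    }
    return modes;
}
```

`std::partial_sort` is used because only the k cheapest candidates need to be ordered.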

In summary, the present invention proposes a new intra prediction method for HEVC, currently the latest international coding standard, in order to increase its running speed. The novelty of the method is that it uses the Canny operator for edge detection and the Hough transform for straight-edge detection, and from this heuristic information extracts the most probable edge directions of the current prediction unit. The candidate prediction modes to be checked are then selected and reduced according to these edge directions. This greatly reduces the computational complexity of HEVC encoding and increases its running speed.

The concept, specific structure and technical effects of the present invention are further described below with reference to the drawings, so that its purpose, features and effects can be fully understood.

Brief description of the drawings

FIG. 1 is a schematic diagram of the intra prediction modes in the HEVC standard;

FIG. 2 is a flowchart of the method proposed by the present invention;

FIG. 3 compares the rate-distortion performance curves of the proposed method and the original method (HM4.0) for the video test sequence Party Scene (832×480p resolution);

FIG. 4 compares the rate-distortion performance curves of the proposed method and the original method (HM4.0) for the video test sequence Slide Editing (720p resolution);

FIG. 5 compares the rate-distortion performance curves of the proposed method and the original method (HM4.0) for the video test sequence Kimono1 (1080p resolution);

FIG. 6 compares the rate-distortion performance curves of the proposed method and the original method (HM4.0) for the video test sequence SteamLocomotiveTrain (4k×2k resolution).

Detailed description

The embodiments of the present invention are described in detail below with reference to the drawings. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the invention is not limited to the embodiment described below.

FIG. 1 shows that the original HEVC intra prediction has many candidate modes, so determining a well-optimized mode has high computational complexity.

The proposed method was implemented and tested in C++ on top of the HEVC HM4.0 software, using the OpenCV library.

As shown in FIG. 2, the method of the present invention for reducing the computational complexity of intra prediction comprises the following steps:

Step one: determine whether the size of the current PU is 4×4. If so, perform mode sub-sampling with a step size of 1 and jump directly to step eight. Otherwise, apply the Canny operator to the current PU. A 4×4 PU is so small that the direction of the straight edges extracted from it correlates only weakly with the direction of the prediction mode, so the method jumps directly to step eight. For PUs larger than 4×4, the direction of the extracted straight edges correlates strongly with the direction of the prediction mode and can therefore be used to determine the candidate modes to be checked. In OpenCV, Canny edge detection is available through the function cvCanny, whose prototype is void cvCanny(const CvArr* image, CvArr* edges, double threshold1, double threshold2, int aperture_size=3). The function cvCanny uses the Canny algorithm to find the edges in the input image and marks them in the output image.

Step two: based on the result of the Canny operator, thin the edges with a morphological method. That is, a morphological erosion with the template element is applied to edges whose width is greater than 3.

Step three: select a suitable threshold to extract and connect the edge points in the PU. Edge points are extracted with the threshold $T_h$, and the optimal threshold is selected by an iterative method, which is implemented as follows:

(1) Find the minimum value $Z_{\min}$ and the maximum value $Z_{\max}$ in the current PU of the image; the initial threshold is then $T_0 = (Z_{\min} + Z_{\max})/2$.

(2) Using the threshold $T_k$ (where $k$ is the iteration count), divide the image into an edge region and a non-edge region and calculate the averages $Z_O$ and $Z_B$ of the two regions. The new threshold $T_{k+1}$ is then computed as

$$T_{k+1} = (Z_O + Z_B)/2$$

(3) If $T_{k+1} = T_k$, then $T_k$ is the desired threshold and the algorithm ends; otherwise go back to step (2). The iteration continues until it converges to a stable threshold, which is the final result $T_h$.

Step four: apply the Hough transform to the current PU to detect straight-line edge pixels, that is, keep the edge points that lie on a straight line and remove the edge points that lie on curves. In OpenCV, straight line segments are detected with the function cvHoughLines2, whose prototype is CvSeq* cvHoughLines2(CvArr* image, void* linestorage, int method, double rho, double theta, int threshold, double param1, double param2). The present invention sets the parameter method to CV_HOUGH_PROBABILISTIC, which selects the probabilistic Hough transform for detecting line segments. threshold is the threshold parameter: a line segment is returned only if its accumulator value is greater than threshold. param1 is the minimum line segment length, and param2 is the maximum gap allowed when joining line segments that lie on the same straight line.

Step six: using the tangent ranges of the prediction angles of the intra prediction modes and the tangents of the straight edges detected in the current PU, build a statistical histogram of the total number of straight-edge pixels in the current PU that fall within each range. The tangent ranges used for the statistics are listed in Table 1 and Table 2. For example, as shown in Table 1, the tangent range for candidate mode 7 is $[-1.1033, -0.7417)$. The pixel count for this range is initialized to zero, $C_7 = 0$; whenever the tangent of the angle of a straight edge falls within $[-1.1033, -0.7417)$, $C_7 = C_7 + a_i$, where $a_i$ is the number of pixels of that line inside the current PU. Applying this procedure to all candidate modes yields the required statistical histogram over the modes. Modes 0 and 3 are non-directional prediction modes, so they are not included in Table 1 and Table 2; they are always passed on as fixed candidate modes to the RDO rough selection.

Step eight: based on the size of the current PU and the histogram, determine the candidate intra modes that enter the rough mode decision (RMD) process. For a PU of a given size, the number of candidate prediction modes selected for the RMD check is given in Table 4. For example, for a 16×16 PU, the 5 candidate modes with the largest values in the histogram above are selected. A comparison of Table 3 and Table 4 shows that the invention greatly reduces the number of candidate modes entering the RMD rough selection process, which greatly reduces the computation and speeds up HEVC encoding.

Step nine: based on the RMD results and the size of the current PU, determine the candidate intra modes that enter the mode refinement (RDO) process. For a PU of a given size, the number of candidate prediction modes selected for the RDO check is given in Table 6. For example, for a 16×16 PU, the 3 candidate modes that perform best in the RMD process are passed on to the final RDO selection. A comparison of Table 5 and Table 6 shows that the invention greatly reduces the number of candidate modes entering the RDO refinement process, which greatly reduces the computation and speeds up HEVC encoding.

The proposed method was tested under the common test conditions. The coding structure is all-intra (I frames only), the tested QP values are 37, 32, 27 and 22, and the video sequences are entropy coded with context-based adaptive binary arithmetic coding (CABAC). Because HEVC is expected to be widely used for high-resolution video, the tests mainly use high-definition and standard-definition sequences in four sizes: 2560×1600p, 1920×1080p, 1280×720p and 832×480p. The performance of the method is assessed mainly in terms of the reduction in encoding time it achieves and the corresponding cost. The performance is expressed through the metrics BD-PSNR/Rate, ΔB, ΔP and ΔT.

Table 7. Experimental test results of the method proposed by the present invention

The experimental results are shown in Table 7. A positive value means that, for the given metric, the value of the proposed method is higher than that of the original method; a negative value means that it is lower. The performance metrics are defined as follows:

(1) The saved encoding time $\Delta T$, namely

where $T_{HM4.0}$ is the encoding time of the existing HM4.0 algorithm and $T_{Proposed}$ is the encoding time of the proposed algorithm. For the same coding efficiency, a larger $\Delta T$ means a greater reduction in encoder computational complexity and hence better performance of the proposed method.

(2) The bit rate difference $\Delta B$ between the proposed method and the HM4.0 intra prediction algorithm, namely

where $B_{HM4.0}$ is the bit rate of the existing HM4.0 method and $B_{Proposed}$ is the bit rate of the proposed method. The smaller $\Delta B$ is, the closer the coding performance of the proposed method is to that of the existing HM4.0 method.

(3) The difference ΔP between the peak signal-to-noise ratio (PSNR) of the luminance component obtained with the proposed method and that obtained with the HM4.0 intra prediction method, namely

$$\Delta P = P_{\mathrm{Proposed}} - P_{\mathrm{HM4.0}}$$

where P_HM4.0 is the luminance PSNR of the existing HM4.0 method and P_Proposed is the luminance PSNR of the method proposed by the present invention. The smaller ΔP is, the closer the coding performance of the proposed method is to that of the existing HM4.0 method.

BD-Rate is the increase in bit rate at the same PSNR. BD-PSNR is the decrease in PSNR at the same bit rate.

Table 7 shows that the proposed method incurs only a very small decrease in PSNR and a very small increase in bit rate, so its coding performance is well preserved. The table also shows that, relative to the standard method in HM4.0, the proposed method saves between 16.15% and 32.73% of the encoding time.

The rate-distortion curves of the proposed method and the original method (HM4.0), obtained from the experiments for video test sequences at different resolutions, are shown in Figures 3, 4, 5, and 6. The abscissa of each figure is the bit rate in kilobits per second (kbps), and the ordinate is the PSNR in dB. In each figure, the rate-distortion curves of the proposed method and the original method are essentially identical. This indicates that the proposed method preserves the accuracy of the prediction mode decision and that, at the same bit rate, the distortion is not noticeably worsened.

Table 7 and Figures 3, 4, 5, and 6 show that the proposed method saves 23% of the encoding time on average relative to the original method. At the same time, its bit rate and PSNR change only slightly, so the loss in performance is almost negligible compared with the computation saved. The method proposed by the present invention therefore performs well.

The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation according to the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims

1. A method for quickly determining an intra-frame prediction coding mode in the international video compression standard HEVC, the method comprising:

Step 1: for the current prediction unit (PU), determine whether its size is 4×4; if so, sub-sample the modes with a step size of 1 and skip directly to step 8; otherwise, apply the Canny operator to the PU;

Step 2: thin the edges in the output of the Canny operator using a morphological method;

Step 3: select an appropriate threshold to extract and connect the edge points in the PU;

Step 4: apply the Hough transform to the current PU to detect straight-line edge pixels, that is, keep the edge points that lie on a straight line and remove the edge points that lie on curves;

Step 5: for each straight-line edge, determine its angle θ relative to the horizontal direction and the tangent tan θ of that angle;

Step 6: from the range of the tangent of the prediction angle of each intra prediction mode and the detected tangent of each straight-line edge in the current PU, build a statistical histogram of the total number of straight-line edge pixels falling within each range in the current PU;

Step 7: from the statistical histogram, determine the several prediction modes with the largest totals of straight-line edge pixels;

Step 8: from the size of the current PU and the statistical histogram obtained, determine the candidate intra modes that enter the rough mode decision (RMD) process;

Step 9: from the result of the RMD process and the size of the current PU, determine the candidate intra modes that enter the rate-distortion optimization (RDO) process;

Step 10: from the result of the RDO process, determine the optimal intra prediction mode of the current PU.
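Steps 4 and 5 can be illustrated with a minimal sketch. It assumes that edge points are given as (x, y) pixel coordinates, uses the normal form x·cos t + y·sin t = ρ, and picks a 5° angular step with integer ρ bins purely for illustration; the function names are hypothetical, and the claim does not specify any of these parameters.

```python
import math

def hough_peak(edge_points, theta_step_deg=5):
    """Vote in (theta, rho) space and return (votes, theta_deg, rho) of the strongest line."""
    votes = {}
    for x, y in edge_points:
        for theta_deg in range(0, 180, theta_step_deg):
            t = math.radians(theta_deg)
            rho = round(x * math.cos(t) + y * math.sin(t))
            votes[(theta_deg, rho)] = votes.get((theta_deg, rho), 0) + 1
    (theta_deg, rho), count = max(votes.items(), key=lambda kv: kv[1])
    return count, theta_deg, rho

def points_on_line(edge_points, theta_deg, rho):
    """Keep only the edge points that fall in the (theta, rho) bin of the detected line."""
    t = math.radians(theta_deg)
    return [(x, y) for x, y in edge_points
            if round(x * math.cos(t) + y * math.sin(t)) == rho]

def line_tangent(theta_deg):
    """Tangent of the line's direction; the line is perpendicular to its normal angle."""
    direction = (theta_deg + 90) % 180
    if direction == 90:
        return math.inf  # vertical line in image coordinates
    return math.tan(math.radians(direction))

# Eight points on a diagonal plus one stray point that is not on the line.
points = [(i, i) for i in range(8)] + [(0, 7)]
count, theta_deg, rho = hough_peak(points)
straight = points_on_line(points, theta_deg, rho)
print(count, theta_deg, rho, len(straight), line_tangent(theta_deg))
```

The angle is measured in image coordinates (y increasing downward), so the sign convention for tan θ would have to be matched to the one used for the intra prediction angles.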

2. The method as claimed in claim 1, wherein in step 2 a morphological method is used to thin detected edges with a width greater than 3, that is, a morphological erosion is performed whose structuring element is

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$
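As an illustration of the erosion in claim 2, the following sketch erodes a binary edge map with the cross-shaped structuring element. It assumes the edge map is a list of rows of 0/1 values and treats pixels outside the PU as 0; both are assumptions, and the sketch erodes the whole map without checking the edge width condition stated in the claim.

```python
# Offsets (row, column) of the 1-entries in the 3x3 cross-shaped structuring element.
CROSS = [(-1, 0), (0, -1), (0, 0), (0, 1), (1, 0)]

def erode_cross(edge_map):
    """Binary erosion: a pixel stays 1 only if every pixel under the cross is 1."""
    rows, cols = len(edge_map), len(edge_map[0])

    def is_set(r, c):
        # Pixels outside the map are treated as background (assumption).
        return 0 <= r < rows and 0 <= c < cols and edge_map[r][c] == 1

    return [[1 if all(is_set(r + dr, c + dc) for dr, dc in CROSS) else 0
             for c in range(cols)]
            for r in range(rows)]

# A thick horizontal band is thinned to its centre row.
band = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
for row in erode_cross(band):
    print(row)
```

The two end pixels of the centre row are removed as well, because their horizontal neighbours lie outside the map under the border assumption made here.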

3. The method as claimed in claim 1, wherein in step 3 a threshold T_h is used for edge point extraction and the optimal threshold is chosen by an iterative method, implemented as follows:

(1) Find the minimum value Z_min and the maximum value Z_max in the current PU of the image, and set the initial threshold T_0 = (Z_min + Z_max)/2;

(2) Using the threshold T_k (where k is the iteration number), divide the image into edge pixels and non-edge pixels, and compute the mean values Z_O and Z_B of the two parts, namely

$$Z_O = \frac{\sum_{i=0}^{T_k} h_i \cdot i}{\sum_{i=0}^{T_k} h_i}$$

$$Z_B = \frac{\sum_{j=T_k+1}^{255} h_j \cdot j}{\sum_{j=T_k+1}^{255} h_j}$$

Here, h_i and h_j denote the numbers of pixels with values i and j in the two parts, respectively. After Z_O and Z_B have been computed, the new threshold T_{k+1} is calculated as

$$T_{k+1} = \frac{Z_O + Z_B}{2};$$

(3) If T_{k+1} = T_k, then T_k is the desired threshold and the algorithm ends; otherwise, return to step (2) and iterate until the threshold converges to a stable value, which is the final result T_h.
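The iteration in claim 3 can be sketched as follows. The sketch assumes 8-bit integer pixel values, interprets h as the histogram of those values, truncates each new threshold to an integer so that it can serve as a summation limit, and adds an iteration cap as a safeguard; the function name and these details are assumptions, not part of the claim.

```python
def iterative_threshold(values, max_iter=100):
    """Iteratively split 8-bit values at T and move T to the midpoint of the two means."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1

    t = (min(values) + max(values)) // 2           # initial threshold T_0
    for _ in range(max_iter):
        n_low = sum(hist[: t + 1])                 # pixel count with value <= T_k
        n_high = sum(hist[t + 1:])                 # pixel count with value >  T_k
        if n_low == 0 or n_high == 0:              # one part is empty: nothing to balance
            return t
        z_o = sum(i * hist[i] for i in range(t + 1)) / n_low
        z_b = sum(j * hist[j] for j in range(t + 1, 256)) / n_high
        t_next = int((z_o + z_b) / 2)              # T_{k+1}
        if t_next == t:                            # converged
            return t
        t = t_next
    return t

print(iterative_threshold([0, 10, 20, 200, 250]))
```

For the sample values, the initial threshold 125 moves to 117 after one update and then stays there.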

4. The method as claimed in claim 1, wherein the threshold used to extract edge points in step 3 is T_h, and the threshold used to connect edge points is smaller than T_h, namely T_h/2; that is, after the processing of steps 1 and 2, the pixels whose values are greater than T_h are edge points; these edge points are then used as seeds, and the adjacent pixels whose values are greater than T_h/2 are added as edge pixels; this process is repeated until all pixels in the PU have been scanned once.
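The two-threshold linking of claim 4 can be sketched as a region-growing pass. The sketch assumes that the input is a 2D list of edge strength values, that "adjacent" means 8-connected neighbours, and that a breadth-first queue is an acceptable way to repeat the growth; the function name and these choices are assumptions.

```python
from collections import deque

def link_edges(magnitude, t_h):
    """Seed with pixels above t_h, then grow into neighbours above t_h / 2."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = {(r, c) for r in range(rows) for c in range(cols)
             if magnitude[r][c] > t_h}
    queue = deque(edges)
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in edges
                        and magnitude[nr][nc] > t_h / 2):
                    edges.add((nr, nc))
                    queue.append((nr, nc))
    return edges

strength = [
    [0,   0,  0, 0],
    [100, 60, 55, 0],
    [0,   0,  0, 0],
    [0,   0, 60, 0],
]
print(sorted(link_edges(strength, t_h=80)))
```

The isolated pixel with value 60 in the last row exceeds T_h/2 but is not connected to any seed, so it is not added.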

5. The method as claimed in claim 1, wherein in step 6, from the range of the tangent of the prediction angle of each intra prediction mode and the detected tangent of each straight-line edge in the current PU, a statistical histogram is used to determine the total number of straight-line edge pixels falling within each range in the current PU, thereby determining the edge direction of the current PU.
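The histogram of claim 5 can be sketched as below. Each detected line is assumed to be summarised as a pair (tan θ, number of edge pixels on the line). The tangent ranges and mode labels in the example are placeholders chosen for illustration only; they are not the ranges assigned to the HEVC intra prediction modes.

```python
def edge_histogram(lines, mode_ranges):
    """Sum the edge pixels of each line into the mode whose tangent range contains it."""
    histogram = {mode: 0 for mode, _, _ in mode_ranges}
    for tan_theta, pixel_count in lines:
        for mode, low, high in mode_ranges:
            if low <= tan_theta < high:
                histogram[mode] += pixel_count
                break
    return histogram

# Placeholder ranges (hypothetical): (mode label, lower bound, upper bound) on tan(theta).
ranges = [("A", -0.5, 0.5), ("B", 0.5, 2.0), ("C", 2.0, 100.0)]
# Detected lines as (tan(theta), pixels on the line).
lines = [(0.1, 6), (1.0, 8), (0.3, 4), (5.0, 3)]

histogram = edge_histogram(lines, ranges)
dominant = max(histogram, key=histogram.get)
print(histogram, dominant)
```

The mode with the largest total is taken as the dominant edge direction of the PU; lines whose tangent falls outside every range are simply ignored in this sketch.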

6. The method as claimed in claim 1, wherein in step 8, for PU sizes 4×4, 8×8, 16×16, 32×32, and 64×64, 9, 7, 5, 3, and 1 candidate modes, respectively, are selected to enter the RMD process; and wherein, for PUs of size 8×8, 16×16, 32×32, and 64×64, the candidate modes with the largest totals of straight-line edge pixels in the statistical histogram are the ones selected to enter the RMD process.
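The selection in claim 6 can be sketched as a top-N pick from the histogram. The candidate counts are taken from the claim; the function name, the dictionary representation of the histogram, and the tie-break by lower mode number are assumptions. The 4×4 case is excluded because those candidates come from mode sub-sampling rather than from the histogram.

```python
# Number of RMD candidates per PU size, as listed in claim 6.
RMD_CANDIDATES = {4: 9, 8: 7, 16: 5, 32: 3, 64: 1}

def rmd_candidates_from_histogram(pu_size, histogram):
    """Return the modes with the largest edge-pixel totals for PU sizes 8 to 64."""
    if pu_size == 4:
        raise ValueError("4x4 PUs take their candidates from mode sub-sampling")
    count = RMD_CANDIDATES[pu_size]
    # Largest total first; ties broken by the lower mode number (assumption).
    ranked = sorted(histogram.items(), key=lambda kv: (-kv[1], kv[0]))
    return [mode for mode, _ in ranked[:count]]

histogram = {10: 12, 26: 30, 18: 5, 2: 30, 34: 1}
print(rmd_candidates_from_histogram(32, histogram))
print(rmd_candidates_from_histogram(64, histogram))
```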

7. The method as claimed in claim 1, wherein in step 9, for PU sizes 4×4, 8×8, 16×16, 32×32, and 64×64, 5, 4, 3, 2, and 1 candidate modes, respectively, are selected to enter the RDO process; the selection criterion is that the candidate modes with the smallest bit rate and prediction distortion after the RMD process enter the RDO process.
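The selection in claim 7 can be sketched in the same way, this time keeping the modes with the lowest RMD cost. The candidate counts are taken from the claim; the sketch assumes that the RMD stage yields one scalar cost per mode that combines bit rate and prediction distortion, and the function name and sample costs are hypothetical.

```python
# Number of RDO candidates per PU size, as listed in claim 7.
RDO_CANDIDATES = {4: 5, 8: 4, 16: 3, 32: 2, 64: 1}

def rdo_candidates(pu_size, rmd_costs):
    """Return the modes with the smallest RMD cost, lowest cost first."""
    count = RDO_CANDIDATES[pu_size]
    ranked = sorted(rmd_costs, key=lambda mode: (rmd_costs[mode], mode))
    return ranked[:count]

# Hypothetical RMD costs per candidate mode.
costs = {2: 410.0, 10: 395.5, 18: 500.0, 26: 380.0}
print(rdo_candidates(16, costs))
print(rdo_candidates(64, costs))
```

Only the modes returned here would go through the full rate-distortion optimization, which is where most of the encoding time is spent.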