TW200910967A

TW200910967A - Low complexity macroblock mode decision and motion search method for scalable video coding with combined coarse granular scalability (CGS) and temporal scalability

Info

Publication number: TW200910967A
Application number: TW96131978A
Authority: TW
Inventors: Hung-Chih Lin; Wen-Hsiao Peng; Hsueh-Ming Hang
Original assignee: Univ Nat Chiao Tung
Priority date: 2007-08-27
Filing date: 2007-08-27
Publication date: 2009-03-01
Also published as: TWI350696B

Abstract

To speed up the encoder while minimizing the loss in coding efficiency, the present invention exploits the computational redundancy between coding layers. Depending on the macroblock (MB) coding modes and the quantization parameters (Qp) of the reference/base layer, a look-up table is used recursively to determine the MB modes to be tested at the enhancement layers. In addition, to avoid exhaustive motion estimation, the reference frame indices of the base layer are adaptively reused and, according to the MB partition at the enhancement layer, the initial search point for motion estimation is selected from either the motion vector at the base layer or the motion vector predictor at the enhancement layer. The proposed schemes are tested with standard sequences in CIF and 4CIF resolutions using 1 base layer, 3 CGS layers, 3 reference frames, and GOP sizes of 8 and 16. Compared with the mode decision algorithm in JSVM 8, the proposed schemes provide, on average, a 76% improvement in overall encoding time, with an average bit-rate increase below 1% and an average Y-PSNR loss below 0.01 dB.

Description

IX. Description of the Invention:

[Technical Field of the Invention]

The present invention relates to an algorithm for communication, video, and multimedia coding, and in particular to a fast, low-complexity algorithm for macroblock mode decision and motion vector estimation in scalable video coding with coarse granular scalability (CGS) and temporal scalability.

[Prior Art]

In the prior art of this field, fast mode decision algorithms that exploit the relationships between coding layers to speed up the encoder were proposed in H. Li, Z.-G. Li, and C. Wen, "Fast Mode Decision for Coarse Grain SNR Scalable Video Coding," IEEE ICASSP, 2006 (hereinafter Prior Art 1); H. Li, Z.-G. Li, and C. Wen, "Fast Mode Decision for Spatial Scalable Video Coding," IEEE ISCAS, 2006 (hereinafter Prior Art 2); and H. Li, Z.-G. Li, and C. Wen, "Fast Mode Decision Algorithm for Inter-Frame Coding in Fully Scalable Video Coding," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 7, pp. 889-895, 2006 (hereinafter Prior Art 3). In Prior Art 1, the mode decision for the CGS enhancement layer is designed on the premise that a macroblock at the enhancement layer has a finer partition than its counterpart at the base layer. In addition, intra prediction at the CGS enhancement layer is restricted to either the Intra4x4 mode or the IntraBL mode. Prior Art 2 extends a similar principle to spatial scalability by additionally reusing the motion information and rate-distortion cost of each base-layer macroblock. In Prior Art 3, the motion activity of the base layer and the temporal level serve as the context for mode selection across temporal layers. However, all of these schemes are designed for only one CGS/spatial enhancement layer and one reference frame, and therefore fall short of the ideal.

Because the average percentage of the IntraBL and Intra4x4 modes at the enhancement layer exceeds 90%, the two proposals L. Xiong, "Reducing Enhancement Layer Directional Intra Prediction Modes," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-P041, 2005 (hereinafter Prior Art 4) and Y. Libo, C. Ying, Z. Jiefu, and Z. Feng, "Low Complexity Intra Prediction for Enhancement Layer," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-Q084, 2005 (hereinafter Prior Art 5) observe that the Intra16x16 and Intra8x8 prediction modes do not provide appreciable coding gain. Without Intra8x8 or Intra16x16, the IntraBL and Intra4x4 modes alone preserve the accuracy of intra prediction well. Removing the Intra8x8 and Intra16x16 modes therefore reduces the heavy computational load of intra coding without affecting performance. Moreover, the Intra4x4 prediction type has nine directional modes, so its computational complexity remains too high. Prior Art 4 therefore uses only three prediction modes (DC, horizontal, and vertical) for intra prediction at the enhancement layer, which greatly reduces computational complexity while maintaining reasonable coding performance. However, this scheme does not appropriately disable certain prediction types defined in the SVC standard, and it tests only a few fixed prediction directions at the enhancement layer. It would be preferable for the candidate modes to adapt to changes in the video sequence.

In D. Alfonso, "SVC Low-complexity Macroblock Mode Decision," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, [document number and year illegible in source] (hereinafter Prior Art 6), the JSVM encoder applies Lagrangian rate-distortion optimization to both motion estimation and macroblock mode decision, which yields the best compression performance at the price of a high computational load. By incorporating into the JSVM encoder the same low-complexity macroblock mode decision used in the JM H.264/AVC reference encoder, an average time saving of 44% to 54% is obtained on the non-motion-estimation tasks of base-layer encoding, with an average compression loss of 4.5% to 6.5% and an average Y-PSNR loss below 0.5 dB. However, this scheme is limited to the base layer of the SVC encoder.

The scalable video coding (SVC) standard proposed by HHI, T. Wiegand, G. Sullivan, J. Reichel, H. Schwarz, and M. Wien, "Joint Draft 9 of SVC Amendment," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-V201, 2007 (hereinafter Prior Art 7), is nearing completion in the joint committee of ISO and ITU. Its basic design concept is to extend the state-of-the-art compression standard H.264/AVC and to reuse most of its innovative components. The bitstream produced by its reference encoder, the Joint Scalable Video Model (JSVM), J. Reichel, H. Schwarz, and M. Wien, "Joint Scalable Video Model JSVM-9," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-V202, 2007 (hereinafter Prior Art 8), provides spatial, temporal, and SNR scalability simultaneously. Temporal scalability is realized with hierarchical B-picture prediction in a closed-loop structure, whereas spatial and SNR scalability are achieved through the layered approach. In this draft, the JSVM encoder selects the best coding mode for each macroblock (MB) by exhaustive search based on rate-distortion optimization (RDO). For each macroblock, computing the RD cost in SVC requires the forward and inverse processes of integer transform, quantization, inverse quantization, inverse integer transform, and entropy coding. Although RDO achieves the highest possible coding performance, the encode-decode complexity of the JSVM encoder is too high for practical use. An algorithm that reduces the computational complexity of SVC without causing appreciable coding loss is therefore desired.
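The rate-distortion cost referred to above is not written out in the text. As a sketch, assuming the usual Lagrangian form associated with H.264/AVC-style encoders (the symbols J, D, R, \lambda, and the candidate mode set \mathcal{M} are notation introduced here for illustration and are not taken from the source), the exhaustive search would select for each macroblock:

```latex
\[
  m^{*} \;=\; \arg\min_{m \in \mathcal{M}} J(m),
  \qquad
  J(m) \;=\; D(m) + \lambda \, R(m)
\]
```

Under this reading, D(m) is the distortion between the original and the reconstructed macroblock when coded with mode m (which is why the transform, quantization, and their inverses must be run), R(m) is the number of bits produced by entropy coding, and \lambda is a Lagrange multiplier typically chosen as a function of Qp. Every mode in \mathcal{M} costs one full evaluation of J, so the search time grows with the size of \mathcal{M}; shrinking the set of tested modes is what reduces encoding time.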

In the scalable video coding (SVC) standard, the prediction modes available to each macroblock (MB) at the base layer and the enhancement layers comprise three intra prediction types (Intra4x4, Intra8x8, and Intra16x16), seven inter prediction modes, and the SKIP mode. At the enhancement layers, new inter-layer prediction techniques are additionally evaluated to remove inter-layer redundancy. Furthermore, when the corresponding macroblock at the base layer has been intra-coded, an additional smoothed reference prediction (MODE_SR) mode is used to reduce the degradation of base-layer prediction. The inventors encoded a variety of test video sequences and found that the computational load of a single enhancement layer is 1.3 to 2.8 times that of a base layer. This means that the JSVM encoder, because of its H.264 codec and layered structure, is too time-consuming to be widely applied. How to reduce the required encoding time is therefore the problem the present invention sets out to solve.

[Summary of the Invention]

In view of the shortcomings of the prior art described above, the improvements proposed by the present invention are as follows:

(i) Information obtained while compressing the base layer is retained, such as the best reference frame index for each partition mode, the motion vectors for each partition mode, the best partition mode of each macroblock, and the best Intra4x4/Intra8x8 intra prediction mode. This base-layer information serves as a reference when compressing the enhancement layers, reducing the number of modes to be tested and thereby greatly reducing the computational complexity of the encoder.

(ii) Further to (i), the best partition mode of each macroblock and the best Intra4x4/Intra8x8 intra prediction mode are updated each time an enhancement layer has been compressed. That is, each enhancement layer refers to the best partition mode and the best Intra4x4/Intra8x8 intra prediction mode of each macroblock in the preceding layer, rather than always referring to the base-layer information, so that the reference becomes progressively more accurate. In addition, the present invention proposes a look-up table of modes to test and a layer-adaptive Intra4x4/Intra8x8 prediction direction decision. Each macroblock at an enhancement layer consults the best partition mode and the best intra prediction mode of the corresponding macroblock in the preceding layer; the look-up table, together with the intra prediction direction decision, then indicates which prediction modes need to be tested. This reduces the large amount of computation otherwise needed to determine the best prediction mode.

(iii) For encoding the enhancement layers, the present invention proposes a better way of setting the initial search point of the motion vector search. For large partition modes (such as 16x16, 16x8, 8x16, and 8x8), it is more appropriate to use the motion vector obtained at the base layer as the motion vector predictor for the initial search point; for small partition modes (such as 8x4, 4x8, and 4x4), the motion vector predictor of the original SVC algorithm is used. Because the enhancement layer then has a better motion vector predictor, its motion vector search range can be smaller than that of the base layer, which saves computation. Furthermore, the reference frame index used by an enhancement-layer macroblock in the motion vector search is very likely to be the same as that of its base-layer counterpart, so the enhancement layer only needs to search for the best motion vector in the reference frame with the same reference frame index. However, when the base layer is encoded at a lower bit rate and the macroblock selects the 16x16 partition size, exhaustive search is used.

(iv) The proposed fast algorithm is applied only to the non-key pictures of the enhancement layers, so the loss in coding efficiency is confined within each group of pictures (GOP). That is, the coding loss in one GOP does not affect other GOPs, and the bit-rate increase caused by the slight loss in coding efficiency can be controlled very stably. Experimental simulation results indicate that the average bit-rate increase is within 1% and the average PSNR drop is within 0.01 dB.

(v) The proposed algorithm adaptively adjusts the candidate mode set to be tested according to the picture quality of the base layer. In addition, when the picture quality of the base layer is too poor, the proposed algorithm does not perform residual prediction, which further reduces the amount of computation.

In short, the fast algorithm proposed by the present invention provides a layer-adaptive mode decision algorithm and a motion search scheme for scalable video coding (SVC) with coarse granular scalability (CGS) and temporal scalability. Another object of the present invention is to speed up the encoder while minimizing the loss in coding efficiency by taking the computational redundancy between coding layers into account: depending on the macroblock (MB) coding modes and the quantization parameters (Qp) of the reference/base layer, a look-up table is used recursively to determine the macroblock modes to be tested at the enhancement layers. In addition, to avoid exhaustive motion estimation, the proposed method adaptively reuses the reference frame indices of the base layer and, according to the macroblock partition at the enhancement layer, selects the initial search point for motion estimation from either the motion vector of the base layer or the motion vector predictor of the enhancement layer, thereby improving the efficiency of the computation.

[Embodiments]

To enable a person of ordinary skill in the art to understand the specific practice and effects of the present invention, a preferred embodiment is described below.

1. Correlation between the Base Layer and the Enhancement Layer

The correlation between coding layers is first analyzed for a configuration that combines CGS with temporal scalability. The statistical analysis is based on encoding one base layer and one CGS enhancement layer at CIF resolution. For brevity, QpB and QpE below denote the quantization parameters of the base layer and the enhancement layer, respectively; a third symbol, illegible in the source, denotes the quantization parameter of the reference layer from which an enhancement layer is predicted.

Table 1. Conditional probability of the inter-coded macroblock mode at the enhancement layer, given the inter prediction mode at the base layer

Enhancement   Base layer, (QpB, QpE) = (39, 29)    Base layer, (QpB, QpE) = (27, 17)
layer mode    16x16   16x8    8x16    8x8          16x16   16x8    8x16    8x8
16x16         0.44    0.25    0.28    0.04         0.19    0.18    0.15    0.10
16x8          0.19    0.25    0.12    0.04         0.16    0.18    0.07    0.05
8x16          0.20    0.12    0.29    0.04         0.17    0.08    0.17    0.05
8x8           0.17    0.38    0.31    0.88         0.48    0.56    0.61    0.80

The macroblock partition at the enhancement layer and that at the base layer reflect similar video characteristics and are therefore highly correlated. As Table 1 shows, the base-layer partition is a good predictor of the enhancement-layer partition. For example, if a base-layer macroblock is coded with a 16x8/8x16 partition, its enhancement-layer counterpart is unlikely to have the 8x16/16x8 aspect ratio. In addition, when the base layer uses the 8x8 partition to obtain better prediction, the enhancement layer also benefits from using the same partition. Raising the quality of the enhancement layer likewise increases the percentage of 8x8 partitions.

Like the macroblock partition, the distribution of intra prediction modes depends strongly on the quality of the base and enhancement layers. Fig. 1 shows the distribution of intra prediction modes at the enhancement layer. When the base layer is coded with better quality using a smaller quantization parameter, most intra prediction is based on IntraBL, which relies on the intra-coded macroblocks of the base layer. On the other hand, as the quality of the enhancement layer improves, intra prediction shifts from the base layer to the enhancement layer. In particular, the percentage of Intra4x4 increases more markedly than that of the other two modes, because it gives better prediction.

To reduce the number of candidate directional modes for Intra4x4/Intra8x8, the similarity between the best directions at the base layer and the enhancement layer is analyzed. In Fig. 2(a), the best modes of the two layers are regarded as highly correlated if the best mode of the enhancement layer equals the best mode of the base layer, one of its two neighboring modes, or the DC mode. For example, if the best mode of the base layer is the vertical mode (mode 0), its two neighboring modes are mode 5 and mode 7. As Fig. 2(b) shows, more than 70% to 80% of the macroblocks coded with the Intra4x4/Intra8x8 prediction types have similar directions, regardless of the difference between QpB and QpE.

The motion vectors of the base layer and the enhancement layer are also considerably correlated. In Fig. 3(a), for example, when the partition size of the macroblock is larger than 8x8, the base-layer motion vector is a good predictor of the enhancement-layer motion vector. On the other hand, Fig. 3(b) shows that the motion vector predictor of the enhancement layer is the better choice for a sub-macroblock partition. By adaptively selecting one of these two predictors as the initial search point, the search range at the enhancement layer can be reduced. Similarly, the statistical analysis shows that an enhancement-layer macroblock is very likely to have the same reference frame index as its base-layer counterpart. One exception arises when the base layer is encoded at a lower bit rate and the macroblock selects the 16x16 partition size; in that case the reference frame index of the base layer may not be reliable for the enhancement layer.

Inter-layer residual prediction is tested when the base-layer residual is not quantized to zero. In Fig. 4(a), the probability of testing inter-layer residual prediction increases markedly as the quality of the base layer is improved by lowering the quantization parameter. However, the conditional probability in Fig. 4(b) shows that only about half of the macroblocks tested for inter-layer residual prediction are actually coded with residual prediction. Moreover, depending on the quality of the enhancement layer, a large percentage of these macroblocks may be coded in the BLSkip mode, which reuses the base-layer motion information. The encoding time can therefore be shortened considerably by testing inter-layer residual prediction only where appropriate.
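The direction-similarity observation above (best mode of the reference layer, its two neighboring modes, and DC) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of Fig. 7: the mode numbers follow the H.264/AVC Intra4x4 numbering (0 vertical, 1 horizontal, 2 DC, and so on), the function name and the angular ordering list are introduced here, and two cases the text leaves open are filled by assumption (a mode at either end of the angular range gets only its single neighbor, and all nine modes are tested when DC is the best mode of the reference layer).

```python
# H.264/AVC Intra4x4 directional modes listed in angular order, from
# horizontal-up (8) round to diagonal-down-left (3). Mode 2 (DC) has no direction.
ANGULAR_ORDER = [8, 1, 6, 4, 5, 0, 7, 3]
DC_MODE = 2


def candidate_intra_directions(ref_best_mode):
    """Return the intra direction modes to test at the enhancement layer.

    ref_best_mode: best Intra4x4/Intra8x8 direction mode (0..8) of the
    co-located macroblock block in the reference layer.
    """
    if not 0 <= ref_best_mode <= 8:
        raise ValueError("intra direction mode must be in 0..8")
    if ref_best_mode == DC_MODE:
        # ASSUMPTION: the text does not say what is tested in this case,
        # so this sketch falls back to testing all nine modes.
        return list(range(9))
    i = ANGULAR_ORDER.index(ref_best_mode)
    # ASSUMPTION: the angular range is not treated as circular, so the two
    # end modes (8 and 3) have one neighbor only.
    neighbors = [ANGULAR_ORDER[j] for j in (i - 1, i + 1)
                 if 0 <= j < len(ANGULAR_ORDER)]
    return sorted({ref_best_mode, DC_MODE, *neighbors})


if __name__ == "__main__":
    # Vertical (mode 0) as the reference-layer best mode, as in the text's example.
    print(candidate_intra_directions(0))
```

For the example given in the text, a reference-layer best mode of 0 (vertical) yields the neighbors 5 and 7 plus DC, so four modes are tested instead of nine.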

Table 2. Look-up table for layer-adaptive mode decision

Columns (best prediction mode of the reference-layer macroblock, grouped by the reference-layer Qp):
  Zone 1, Qp 0-33: inter prediction modes Direct, 16x16, 16x8, 8x16, 8x8; Intra
  Zone 2, Qp 34-51: inter prediction modes Direct, 16x16, 16x8, 8x16, 8x8; Intra
Rows (candidate modes to be tested at the enhancement layer): Direct, 16x16, 16x8, 8x16, 8x8, BLskip, Direct_RES, [label illegible]_RES, BLskip_RES, MODE_SR, Intra8x8, Intra4x4, IntraBL
Cell marks (a circle means the mode is tested): positions not recoverable from the extracted text.
Notes: the 8x8 prediction mode includes the 8x8, 8x4, 4x8, and 4x4 prediction modes; the subscript RES indicates that the prediction mode is used with residual prediction.

2. Mode Decision Algorithm of the Present Invention

Based on the observations in the preceding section, the present invention proposes a layer-adaptive mode decision algorithm and a motion search method for combined CGS and temporal scalability. As shown in Fig. 5, to exploit the inter-layer correlation, the base layer is encoded with exhaustive search and all motion information for every possible combination is retained. The coding modes to be tested at an enhancement layer are then obtained from Table 2 by looking up the quantization parameter and the coding mode of the reference layer. In addition, the algorithm in Fig. 6 determines the reference frame for the motion search according to the partition size, and the initial search point is adaptively selected from the motion vector of the base layer or the motion vector predictor of the enhancement layer. In Fig. 6, a value of 0 for the first flag indicates that prediction from reference list 0 (List0) is turned off; likewise, a value of 0 for the second flag indicates that prediction from reference list 1 (List1) is turned off (both flag symbols are garbled in the source).

2.1 Layer-Adaptive Mode Decision

To shorten the encoding time substantially while minimizing the loss in coding efficiency, the layer-adaptive mode selection in Table 2 is designed around the inter-layer correlation. For example, when the reference layer is coded with a quantization parameter in the range 34 to 51, inter-layer residual prediction is skipped because the reference layer is of low quality. In addition, as the analysis indicates, when a macroblock is coded with a 16x8/8x16 partition, its enhancement-layer counterpart is not evaluated with the 8x16/16x8 partition. Likewise, when a macroblock is coded with the 8x8 partition, its enhancement-layer counterpart is not tested with any mode whose partition size is larger than 8x8. As a further refinement, intra prediction at the enhancement layer is restricted to Intra4x4, Intra8x8, or IntraBL. Moreover, as shown in Fig. 7, except when DC is the best mode of the reference layer, only four modes of the Intra4x4 and Intra8x8 prediction types are tested at the enhancement layer.

On the other hand, when the reference layer is coded with better quality, the main changes relative to the above are:

(1) all modes with inter-layer residual prediction are tested; and

(2 )當一參照層宏塊以1 6 X 8、8 X 1 6或8 X 8編碼時，僅有8 X 8預測模式及具有相同切割之模式受到測試。於後者中，雖然表一指出此等設計在模式分佈方面可能並非最爲理想，但實驗結果顯示以 16x8或8x16切割替換16x16切割對編碼效能之影響微不足道，尤其當加強層係以較高品質編碼時》 2.2分層適應移動搜索類推至模式選擇，第六圖之分層適應移動捜索係藉由利用基本層移動資訊之設計，從而避免於加強層之窮舉移動捜索。當欲以非 1 6 X 1 6之預測模式測試加強層之一宏塊時，複用位於基本層之相關參照索引。然而，於16x16預測模式中，當參照層之編碼品質較 I 差時，表示位於基本層的參照畫面索引無法可靠用於加強層，因而此時仍必須進行窮舉捜索。本發明係經使用一基本層、三粗略可調層、三參照畫面及8和16 組群畫面尺寸於CIF及4CIF解析之標準序列測試。相較於ISVM8 的模式判斷演算法，本發明於整體編碼時間平均提供7 6 %之改進，且平均僅增加1%以下之位元傳輸率及O.OldB以下之Y-PSNR損失。綜上所述，本發明具有下列優點： 1. 所提出之演算法可以使用在編碼器之特性包含多粗略可調編 16 200910967 碼層（Multiple CGS layers)且其參考畫面可以是多張畫面 (Multiple reference frames)時，充分地利用基本層（Base layer) 所得到之資訊，如宏塊切割模式（MBpartitionmode)、動作向量 (Motion vector)、最佳參考畫面索引（Reference frame index)與最佳內部預測模式（Intra prediction mode)等等，以達到大量降低編碼端之運算複雜度之目的並且接近原來之編碼效率。 2.所提出之演算法仍保留In tra8x 8之預測模組，並且會隨著局部影像之特性而改變測試不同方向之預測模式。 3 .所提出之演算法，不論在編碼時間的節省與編碼效率的損失兩方面比較下，均比[先前技術6]所提之方式要來的好。另外，[先前技術6]所提之演算法只侷限在基本層之實現，尙未提出應用在多層編碼時之方式，在某些測試影像中，亦會產生明顯之編碼效率損失。本發明已藉上述較佳實施例加以說明，以上所述者，僅爲本發明之較佳實施例，並非用來限定本發明實施之範圍。凡依本發明申請專利範圍所述之技術特徵及精神所爲之均等變化與修飾，均應包含於本發明之申請專利範圍內。【圖式簡單說明】第1圖爲加強層內部預測模式之分佈圖。第2圖爲（a) Intra4x4/Intra8x8預測方向（參照用）（b) 基本層與加強層間內部預測方向之關聯性。第3圖爲移動向量差異以（a)基本層之移動向量（b)加強層移動 17 200910967 向量預測參考値爲參考。第 4圖爲（a )測試殘差預測之槪率（b )使用殘差預測之條件槪率。第5圖爲分層適應模式判斷流程圖。第6圖爲參照畫面索引之分層適應選擇流程圖。第7圖爲分層適應Intra4x4/Intra8x8預測方向判斷圖。【主要元件符號說明】(2) When a reference layer macroblock is encoded in 1 6 X 8, 8 X 1 6 or 8 X 8 , only the 8 X 8 prediction mode and the mode with the same cut are tested. In the latter case, although Table 1 indicates that these designs may not be optimal in terms of mode distribution, the experimental results show that the effect of replacing 16x16 cuts with 16x8 or 8x16 cuts on coding performance is negligible, especially when the enhancement layer is coded at a higher quality. Time 2.2 The layered adaptive mobile search analog to mode selection, the sixth layer of the layered adaptive mobile search system by using the basic layer of mobile information design, thereby avoiding the exhaustive mobile search. 
When one of the enhancement layer macroblocks is to be tested in a prediction mode other than 16 6 16 , the associated reference index located in the base layer is multiplexed. However, in the 16x16 prediction mode, when the coding quality of the reference layer is worse than I, it means that the reference picture index located at the base layer cannot be reliably used for the enhancement layer, and therefore it is necessary to perform an exhaustive search at this time. The present invention is tested in a standard sequence using CIF and 4CIF parsing using a base layer, three coarsely adjustable layers, three reference pictures, and 8 and 16 group picture sizes. Compared with the mode judgment algorithm of ISVM8, the present invention provides an improvement of 76% on the overall coding time, and only increases the bit transmission rate below 1% and the Y-PSNR loss below O.OldB. In summary, the present invention has the following advantages: 1. The proposed algorithm can be used in the characteristics of the encoder to include multiple coarsely tunable 16 200910967 code layers (Multiple CGS layers) and its reference picture can be multiple pictures ( Multiple reference frames), make full use of the information obtained by the base layer, such as MB blockition mode (MBpartitionmode), motion vector (Motion vector), the best reference frame index (Reference frame index) and the best internal Intra prediction mode, etc., to achieve a large reduction in the computational complexity of the encoding end and close to the original encoding efficiency. 2. The proposed algorithm still retains the prediction module of In tra8x 8, and changes the prediction mode in different directions according to the characteristics of the local image. 3. The proposed algorithm, both in terms of coding time savings and coding efficiency loss, is better than the method proposed in [Prior Art 6]. 
In addition, the algorithm of [Prior Art 6] is limited to the base layer, does not address how it would be applied to multi-layer coding, and causes a noticeable loss of coding efficiency on some test sequences.
The present invention has been described by way of the preferred embodiments above. These are only preferred embodiments and are not intended to limit the scope of the invention. All equivalent changes and modifications made in accordance with the technical features and spirit described in the claims shall fall within the scope of the claims of the present invention.
[Brief description of the drawings]
Fig. 1 shows the distribution of intra prediction modes at the enhancement layer.
Fig. 2 shows (a) the Intra4x4/Intra8x8 prediction directions (for reference) and (b) the correlation of intra prediction directions between the base layer and the enhancement layer.
Fig. 3 shows the motion vector difference taken with respect to (a) the motion vector of the base layer and (b) the motion vector predictor of the enhancement layer.
Fig. 4 shows (a) the probability of testing residual prediction and (b) the conditional probability of using residual prediction.
Fig. 5 is a flowchart of the layer-adaptive mode decision.
Fig. 6 is a flowchart of the layer-adaptive selection of the reference frame index.
Fig. 7 is a diagram of the layer-adaptive Intra4x4/Intra8x8 prediction direction decision.
[Description of main component symbols]


Claims

X. Scope of the patent claims:
1. A fast algorithm for low-complexity macroblock mode decision and motion vector estimation, applicable to video coding with coarse granular and temporal scalability, characterized in that: the information obtained when the base layer is compressed is retained and collected, and is used as a reference when the enhancement layers are compressed, so as to reduce the number of modes tested and narrow the motion vector search range, thereby greatly reducing the computational complexity of the encoder.
2. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 1, wherein the information obtained when the base layer is compressed is: the best reference frame index for each partition mode, the motion vector for each partition mode, the best partition mode of each macroblock, and the Intra4x4/Intra8x8 intra prediction mode.
3. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 2, wherein the best inter prediction mode of each macroblock and the Intra4x4/Intra8x8 intra prediction mode are updated as each enhancement layer is compressed; that is, each enhancement layer takes as its reference the best partition mode and the Intra4x4/Intra8x8 intra prediction mode of each macroblock in the immediately preceding layer, rather than always using the information of the base layer, so that the reference becomes more accurate.
4. A test-mode look-up table for the fast algorithm for low-complexity macroblock mode decision and motion vector estimation, applicable to video coding with coarse granular and temporal scalability, characterized in that the table uses the method of claims 1 to 3: for each macroblock in an enhancement layer, the best partition mode and the best intra prediction mode of the corresponding macroblock in the previous layer are referenced, and the look-up table indicates the prediction modes that need to be tested.
5. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 3, wherein, when an enhancement layer is coded, the initial search point for the motion vector is set as follows: for large partition modes (e.g., 16x16, 16x8, 8x16, and 8x8), the motion vector obtained at the base layer is used as the motion vector predictor; for small partition modes (e.g., 8x4, 4x8, and 4x4), the motion vector predictor of the SVC algorithm is used. Because the enhancement layer thus has a better motion vector predictor, its motion vector search range can be smaller than that of the base layer, which saves computation. In addition, the reference frame index used by an enhancement-layer macroblock during motion search can directly reuse the reference frame index of the corresponding base-layer macroblock; however, when the base layer is coded at a lower bit rate and the macroblock selects the 16x16 partition, an exhaustive search is used.
6. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 3, wherein the fast algorithm is applied only to non-key pictures in the enhancement layers, so that the loss of coding efficiency is confined within each group of pictures (GOP) and does not affect other GOPs, and the bit-rate increase caused by the slight loss of coding efficiency can therefore be stably controlled.
7. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 6, wherein the average bit-rate increase caused by the slight loss of coding efficiency is within 1%, and the average PSNR decrease is within 0.01 dB.
8. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 5, wherein the candidate mode set to be referenced is adaptively adjusted according to the quality of the base layer.
9. The fast algorithm for low-complexity macroblock mode decision and motion vector estimation according to claim 8, wherein, when the quality of the base layer is too poor, residual prediction is not performed, so as to reduce the amount of computation.
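For illustration only, the motion search set-up described in section 2.2 and recited in claim 5 might be sketched as follows. This is a minimal sketch and not the inventors' implementation: the names `BaseLayerInfo`, `candidate_ref_indices`, and `initial_search_point` are hypothetical, the base-layer data is simplified to a single reference index and motion vector for the partition under test, and the `low_quality` flag is assumed to be set by the caller, since the text gives no threshold for when the base layer counts as poorly coded.

```python
from dataclasses import dataclass
from typing import List, Tuple

MotionVector = Tuple[int, int]

# Partition groups as listed in claim 5.
LARGE_PARTITIONS = {"16x16", "16x8", "8x16", "8x8"}
SMALL_PARTITIONS = {"8x4", "4x8", "4x4"}


@dataclass
class BaseLayerInfo:
    """Hypothetical container for data kept from the co-located base-layer macroblock."""
    ref_idx: int                 # best reference frame index found at the base layer
    motion_vector: MotionVector  # base-layer motion vector for the same partition
    low_quality: bool            # assumption: caller decides when the base layer is "poor"


def candidate_ref_indices(partition: str, base: BaseLayerInfo,
                          num_ref_frames: int) -> List[int]:
    """Return the reference frame indices to search at the enhancement layer."""
    # 16x16 with a poorly coded base layer: the base-layer index is not
    # reliable, so every reference frame is searched (exhaustive search).
    if partition == "16x16" and base.low_quality:
        return list(range(num_ref_frames))
    # Otherwise the base-layer reference index is reused directly.
    return [base.ref_idx]


def initial_search_point(partition: str, base: BaseLayerInfo,
                         enhancement_predictor: MotionVector) -> MotionVector:
    """Pick the starting point of the motion search for one partition mode."""
    if partition in LARGE_PARTITIONS:
        # Large partitions start from the base-layer motion vector.
        return base.motion_vector
    if partition in SMALL_PARTITIONS:
        # Small partitions start from the enhancement-layer motion vector predictor.
        return enhancement_predictor
    raise ValueError(f"unknown partition mode: {partition}")
```

Under these assumptions, an encoder loop would call `candidate_ref_indices` once per partition mode to decide which reference frames to search, then start the search within each of them at the point returned by `initial_search_point`, using a search range smaller than that of the base layer.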