CN1306826C - Loop filter based on multistage parallel pipeline mode - Google Patents
- Publication number
- CN1306826C (also listed as CNB2004100702068A and CN200410070206A)
- Authority
- CN
- China
- Prior art keywords
- filtering
- boundary
- data
- vertical
- macroblock
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Processing (AREA)
Abstract
The invention relates to a loop filter device based on a multi-stage parallel pipeline. The process of computing the boundary strengths and boundary thresholds of a whole macroblock and the process of filtering the boundaries of the whole macroblock are split into two pipeline stages. The boundaries of the macroblock are filtered block by block in a block-level pipeline. Each row/column of data at a block boundary to be filtered is fed in turn into a multi-stage pipelined vertical/horizontal filter, which produces the filtered row/column data. While the blocks of the current macroblock are being filtered, data whose filtering is complete are written to external memory. By using multi-stage parallel pipelines to perform loop filtering of every macroblock boundary of a video image, the invention increases filtering speed, reduces the pressure of external memory accesses, ensures that loop filtering keeps up with real-time image encoding and decoding, and keeps the complexity of the overall hardware structure under control, which facilitates design and implementation.
Description
Technical field
The invention relates to digital image encoding and decoding technology, and in particular to a loop filter for a high-performance real-time video processor. It belongs to the technical field of video encoding and decoding.
Background art
The image coding standards of the ISO MPEG family and the ITU H.26x series use block-based motion estimation and the discrete cosine transform. A problem with this kind of coding is that boundary effects, known as blocking artifacts, appear between adjacent pixels of neighboring data blocks. Many boundary filtering schemes have been proposed to reduce the visible blocking artifacts introduced during image encoding and decoding. The latest international coding standard, MPEG-4 Part 10/H.264, and the Chinese coding standard AVS both adopt a deblocking loop filter, which significantly improves the subjective quality of coded images.
Fig. 1 is a flowchart of macroblock loop filtering. First, the boundary strength and boundary threshold of each boundary in the macroblock to be filtered are computed. Using these values, the vertical boundaries of the macroblock luma data are filtered, followed by the horizontal boundaries of the luma data. The vertical and then the horizontal boundaries of the macroblock chroma data are filtered in the same order. Finally, the luma and chroma data, now filtered along both vertical and horizontal boundaries, are output.
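As an illustration only, the order of operations in Fig. 1 can be sketched in Python as follows. The function name `loop_filter_macroblock` and the callables `compute_params` and `filter_edges` are hypothetical placeholders introduced for this sketch; they stand in for the boundary strength/threshold calculation and the edge filter, neither of which is specified at this point in the text.

```python
def loop_filter_macroblock(macroblock, compute_params, filter_edges):
    """Apply the Fig. 1 ordering to one macroblock and return the steps taken.

    macroblock     -- dict with "luma" and "chroma" planes (assumed layout)
    compute_params -- hypothetical callable returning strengths/thresholds
    filter_edges   -- hypothetical callable filtering one plane in one direction
    """
    steps = []
    # Boundary strengths and thresholds are computed once, before any filtering.
    params = compute_params(macroblock)
    steps.append("params")
    # Luma first, then chroma; within each plane, vertical edges before horizontal.
    for plane in ("luma", "chroma"):
        for direction in ("vertical", "horizontal"):
            filter_edges(macroblock[plane], direction, params)
            steps.append(f"{plane}-{direction}")
    # Only after all four passes is the macroblock ready for output.
    steps.append("output")
    return steps
```

The sketch fixes only the ordering; a real device would overlap these steps across macroblocks rather than run them strictly one after another.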
Fig. 2 shows the vertical/horizontal filtering boundaries of a block: the boundaries of a macroblock that need filtering are the vertical boundary 1 (vertical edge) and the horizontal boundary 2 (horizontal edge).
This kind of loop filtering requires filtering the pixels on every boundary of the block data, which involves a large amount of computation. It also requires irregular accesses to pixel values during filtering. With a conventional hardware design, filtering speed therefore suffers greatly, and the real-time image encoding and decoding requirements of practical applications cannot be met.
MPEG-4 Part 10/H.264 and AVS are, respectively, the latest international and Chinese video coding standards. Few methods have been proposed for dealing with the complexity of the loop filter described above; only a handful of academic papers analyze the problem and propose implementations. Some of these do not consider loop filtering together with reference frame storage and motion compensation; they are confined to implementing the loop filter algorithm and do not take the overall codec design into account, which makes them hard to use in practice. Others propose instruction-word-based implementations, with instruction storage and instruction decoding, which suit software accelerators but not hardware designs. Consequently, for real-time encoding and decoding of high-definition images, and given the ever higher processing speeds demanded of video, these methods are often impractical because they are difficult or too costly to implement in hardware, and further improvement is urgently needed.
Summary of the invention
The technical problem to be solved by the invention is to provide a loop filter device based on a multi-stage parallel pipeline. The device uses a multi-stage parallel pipeline structure to perform loop filtering of every macroblock boundary of a video image, ensures that loop filtering keeps up with real-time image encoding and decoding, reduces the pressure of external memory accesses, and keeps the complexity of the overall hardware structure under control, which facilitates design and implementation.
The technical problem is solved by the following technical solution:
A loop filter device based on a multi-stage parallel pipeline comprises three pipeline-stage modules: a macroblock boundary calculation module, a vertical/horizontal filtering module and an output module. The three modules operate concurrently in pipeline order.
The macroblock boundary calculation module comprises a boundary strength calculator, a boundary threshold calculator and a first-in-first-out (FIFO) data buffer. The boundary strength calculator and the boundary threshold calculator each fetch from external memory the boundary data of the macroblock to be filtered, compute the boundary strength and threshold of every block in the macroblock, and store them in the FIFO data buffer. The FIFO data buffer passes the boundary strengths and thresholds to the vertical/horizontal filtering module in first-in-first-out order.
The vertical/horizontal filtering module comprises a vertical/horizontal filter, an arranger, an intermediate data buffer, a selector and a loop filter controller. The selector is connected to external memory; under the control of the loop filter controller it fetches the macroblock boundary data to be filtered from external memory and stores them in the intermediate data buffer. The arranger, connected to the intermediate data buffer, arranges the macroblock boundary data in row or column order and feeds them in turn into the vertical/horizontal filter. Under the control of the loop filter controller, the vertical/horizontal filter performs vertical/horizontal boundary filtering on the row and column data from the arranger, writes data that need further filtering back to the intermediate data buffer, and writes fully filtered data to the output module.
The output module comprises an output data buffer and an external memory write interface. The output data buffer stores the fully filtered data received from the vertical/horizontal filter and passes them to the external memory write interface, which writes them to external memory under the control of the loop filter controller.
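One way to picture the role of the arranger is the following Python sketch. It is an assumption about how row/column ordering lets a single one-dimensional filter serve both directions, not a description of the actual circuit; the name `arrange_lines` and the list-of-rows block layout are hypothetical.

```python
def arrange_lines(first_block, second_block, direction):
    """Yield 1-D sample lines that cross the boundary between two blocks.

    direction == "vertical":   first_block is the left neighbour, second_block
                               the current block; one line per row.
    direction == "horizontal": first_block is the upper neighbour, second_block
                               the current block; one line per column.
    Blocks are lists of rows (assumed layout for this sketch).
    """
    if direction == "vertical":
        # Each row of the left block is continued by the same row of the current block.
        for left_row, cur_row in zip(first_block, second_block):
            yield list(left_row) + list(cur_row)
    elif direction == "horizontal":
        # Each column of the upper block is continued by the same column below it.
        for c in range(len(second_block[0])):
            yield [row[c] for row in first_block] + [row[c] for row in second_block]
    else:
        raise ValueError(f"unknown direction: {direction}")
```

For example, with `a = [[1, 2], [3, 4]]` and `b = [[5, 6], [7, 8]]`, the vertical arrangement yields `[1, 2, 5, 6]` and `[3, 4, 7, 8]`, while the horizontal arrangement yields `[1, 3, 5, 7]` and `[2, 4, 6, 8]`. In both cases the boundary sits in the middle of each line, so the same filter logic can be reused.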
The device as a whole performs loop filtering in three pipeline stages. The first stage computes the boundary strength and boundary threshold of every filtering boundary of the macroblock to be filtered. The second stage performs vertical and horizontal filtering on each of those boundaries. The third stage outputs the filtered macroblock data to external memory. Distributing the steps of loop filtering over separate pipeline stages both reduces the complexity of each stage and increases the parallelism of the loop filtering operation as a whole.
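The overlap gained from three stages can be illustrated with an idealized schedule. The sketch below assumes, purely for illustration, that every stage takes one time slot per macroblock; the text does not state stage latencies, and the function name `pipeline_schedule` and the stage labels are hypothetical.

```python
def pipeline_schedule(num_macroblocks,
                      stages=("boundary calc", "filter", "output")):
    """Return, for each time slot, the macroblock index each stage works on.

    Idealized model: every stage takes exactly one slot per macroblock
    (an assumption of this sketch). Idle stages are marked with None.
    """
    slots = []
    for t in range(num_macroblocks + len(stages) - 1):
        slot = {}
        for s, name in enumerate(stages):
            mb = t - s  # stage s lags stage 0 by s slots
            slot[name] = mb if 0 <= mb < num_macroblocks else None
        slots.append(slot)
    return slots
```

Under this model, N macroblocks finish in N + 2 slots instead of 3N. Once the pipeline is full, each slot has one macroblock in boundary calculation, the previous one in filtering and the one before that being written out.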
Within the loop filter device, the horizontal and vertical boundary filtering of the second pipeline stage is itself designed as a multi-stage pipeline. With these two parallel pipeline structures, the device can fully meet the loop filtering requirements of real-time high-definition video encoding and decoding.
The vertical/horizontal filter comprises a filter condition decider module and a filter calculator module.
The filter condition decider module comprises a filter condition decider and a data buffer. The filter condition decider receives the row and column data from the arranger and the boundary strengths and thresholds from the FIFO data buffer, evaluates the boundary filtering decision conditions, determines which boundary pixels are to be filtered, and passes those pixels through the data buffer to the next pipeline stage for filtering.
The filter calculator module comprises a filter calculator, which performs the filtering calculation on each pixel and outputs the filtered pixels.
The filter condition decider is further divided into two or more pipeline stages, which operate concurrently in pipeline order.
The filter calculator is likewise further divided into two or more pipeline stages, which operate concurrently in pipeline order.
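The split between a decision stage and a calculation stage can be sketched as follows. The text does not give the decision conditions or the filter formula, so this sketch borrows a simplified form of the normal-strength luma edge filter of H.264 as an assumption: only the two samples nearest the edge are modified, and the clipping value `tc` is passed in rather than looked up in a table. All function names are hypothetical.

```python
def clip(lo, hi, x):
    """Clamp x to the range [lo, hi]."""
    return max(lo, min(hi, x))

def decide(line, bs, alpha, beta):
    """Stage A (condition decider): should this line [p1, p0, q0, q1] be filtered?"""
    p1, p0, q0, q1 = line
    return (bs > 0 and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta and abs(q1 - q0) < beta)

def calculate(line, tc):
    """Stage B (filter calculator): adjust the two samples nearest the edge."""
    p1, p0, q0, q1 = line
    delta = clip(-tc, tc, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)
    return [p1, clip(0, 255, p0 + delta), clip(0, 255, q0 - delta), q1]

def two_stage_filter(lines, bs, alpha, beta, tc):
    """Run lines through both stages with a one-entry buffer between them."""
    register = None  # models the data buffer between decider and calculator
    out = []
    for line in list(lines) + [None]:  # one extra cycle drains the buffer
        if register is not None:
            held, do_filter = register
            out.append(calculate(held, tc) if do_filter else list(held))
        register = (line, decide(line, bs, alpha, beta)) if line is not None else None
    return out
```

With `bs=2`, `alpha=20`, `beta=10` and `tc=2`, the line `[100, 100, 108, 108]` becomes `[100, 102, 106, 108]`, whereas `[50, 50, 200, 200]` is left unchanged because the step across the edge exceeds `alpha` and is treated as a real image edge. In each loop iteration the calculator works on the previous line while the decider examines the current one, which is the overlap the two-module structure is meant to provide.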
The invention filters the macroblock boundaries through the following steps:
Step 1: Under the control of the loop filter controller, the selector fetches the macroblock boundary data to be filtered from external memory and stores them in the intermediate data buffer; the arranger then arranges the data in row or column order and feeds them in turn into the vertical/horizontal filter. At the same time, the boundary strength calculator and the boundary threshold calculator each fetch the boundary data of the macroblock to be filtered from external memory, compute the boundary strength and threshold of every block in the macroblock, and store them in the FIFO data buffer, which passes them to the vertical/horizontal filter in first-in-first-out order.
Step 2: The filter controller uses the block boundary strengths and boundary thresholds computed in the previous stage to filter the boundaries of the whole macroblock; the loop filter controller controls the vertical/horizontal filtering of the macroblock boundaries in a block-level pipeline. Filtered data that need further filtering are written back to the intermediate data buffer, and fully filtered data are written to the output module.
Step 3: While the filtering of one block is being completed, the row/column data of the next block boundary to be filtered are read from the intermediate data buffer and fed in turn into the vertical/horizontal filter in the manner described above.
Step 4: Steps 2 and 3 are repeated until all boundary data of the current macroblock have been filtered.
Step 5: While the blocks of the current macroblock are being filtered, the final data output by the vertical/horizontal filter are written to the output data buffer, where they wait for the external memory write interface to write them to external memory.
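The steps above route each filtered block either back to the intermediate data buffer or on to the output data buffer. A toy model of that decision, under the assumption that a block is final once every boundary operation touching it has run, might look like this in Python. The function name `release_order` and the block labels are hypothetical, and the model ignores edges shared with macroblocks that have not yet been processed.

```python
from collections import Counter

def release_order(edge_ops):
    """Track when each block becomes final as boundary operations complete.

    edge_ops -- ordered list of (block_a, block_b) pairs, one per filtered
                boundary. A block is released to the output buffer as soon
                as no later operation in the list touches it (toy assumption).
    Returns (released_per_op, output_buffer_order).
    """
    remaining = Counter()
    for a, b in edge_ops:
        remaining[a] += 1
        remaining[b] += 1
    released_per_op = []
    output_buffer = []
    for i, (a, b) in enumerate(edge_ops):
        done = []
        for blk in (a, b):
            remaining[blk] -= 1
            if remaining[blk] == 0:
                # No further filtering needed: move to the output buffer.
                output_buffer.append(blk)
                done.append(blk)
            # Otherwise the block stays in the intermediate buffer.
        released_per_op.append((i, done))
    return released_per_op, output_buffer
```

For the operations `[("L", "A"), ("A", "B"), ("TA", "A"), ("TB", "B")]`, where `L` is a left neighbour and `TA`, `TB` are upper neighbours of current blocks `A` and `B`, block `L` is released after the first operation while `A` and `B` stay buffered until their horizontal boundaries are done. Releasing blocks early in this way is what allows writes to external memory to overlap with the remaining filtering.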
The invention uses a multi-stage parallel pipeline structure to perform loop filtering of every macroblock boundary of a video image, and takes into account the reference frame storage process and the practical interfacing of the loop filter within a codec design. It thereby ensures that loop filtering keeps up with real-time image encoding and decoding, reduces the pressure of external memory accesses, and keeps the complexity of the overall hardware structure under control, which facilitates design and implementation.
Description of the drawings
Fig. 1 is a flowchart of macroblock loop filtering;
Fig. 2 shows the vertical/horizontal filtering boundaries of a block;
Fig. 3 is a schematic diagram of the structure of the invention;
Fig. 4 is a schematic diagram of the two-stage pipelined horizontal/vertical boundary filter.
Detailed description
The technical solution of the invention is further described below with reference to the drawings and specific embodiments:
First embodiment:
A loop filter device based on a multi-stage parallel pipeline, and the steps by which it implements the loop filter algorithm of the AVS standard:
As shown in Fig. 3, a loop filter device based on a multi-stage parallel pipeline comprises three pipeline-stage modules: a macroblock boundary calculation module, a vertical/horizontal filtering module and an output module. The three modules operate concurrently in pipeline order.
The macroblock boundary calculation module comprises a boundary strength calculator, a boundary threshold calculator and a FIFO (first-in-first-out) data buffer. The boundary strength calculator and the boundary threshold calculator each fetch from external memory the boundary data of the macroblock to be filtered, compute the boundary strength and threshold of every block in the macroblock, and store them in the FIFO data buffer. The FIFO data buffer passes the boundary strengths and thresholds to the vertical/horizontal filtering module in first-in-first-out order.
The vertical/horizontal filtering module comprises a vertical/horizontal filter, an arranger, an intermediate data buffer, a selector and a loop filter controller. The selector is connected to external memory; under the control of the loop filter controller it fetches the macroblock boundary data to be filtered from external memory and stores them in the intermediate data buffer. The arranger, connected to the intermediate data buffer, arranges the macroblock boundary data in row or column order and feeds them in turn into the vertical/horizontal filter. Under the control of the loop filter controller, the vertical/horizontal filter performs vertical/horizontal boundary filtering on the row and column data from the arranger, writes data that need further filtering back to the intermediate data buffer, and writes fully filtered data to the output module.
The output module comprises an output data buffer and an external memory write interface. The output data buffer stores the fully filtered data received from the vertical/horizontal boundary filter and passes them to the external memory write interface, which writes them to external memory under the control of the loop filter controller.
The device as a whole performs loop filtering in three pipeline stages. The first stage computes the boundary strength and boundary threshold of every filtering boundary of the macroblock to be filtered. The second stage performs vertical and horizontal filtering on each of those boundaries. The third stage outputs the filtered macroblock data to external memory. Distributing the steps of loop filtering over separate pipeline stages both reduces the complexity of each stage and increases the parallelism of the loop filtering operation as a whole.
The invention implements the loop filter algorithm of the AVS standard through the following steps:
Step 1: Compute the boundary filtering strength and boundary threshold for the 8x8 blocks in the macroblock.
Step 2: At the same time, read from external memory the data of the left, upper and current blocks to be filtered, and place them in the intermediate data buffer.
Step 3: Once the block data of the current macroblock, the boundary filtering strength and the boundary threshold are ready, the block data, together with the data of the left block awaiting filtering, pass through the intermediate data buffer into the data arranger and are then fed row by row into the vertical/horizontal filter.
Step 4: The vertical/horizontal filter filters the vertical boundary data in a pipelined manner, while the filtered data are written back in turn to the intermediate data buffer.
Step 5: After vertical boundary filtering of one block of the current macroblock is complete, the data of the left macroblock that are ready for output are written to the output data buffer and the output interface is notified to write them to external memory; at the same time, vertical filtering of the next block boundary of the current macroblock begins.
Step 6: Repeat the operations of step 3, step 4 or step 5 until vertical filtering of all block boundaries is complete.
Step 7: Fetch from the intermediate buffer the data of the upper and current macroblock boundaries awaiting horizontal filtering and send them to the data arranger.
Step 8: The block data of the current macroblock and the upper block data awaiting filtering pass through the intermediate data buffer into the data arranger and are then fed column by column into the vertical/horizontal filter.
Step 9: The vertical/horizontal filter filters the horizontal boundary data in a pipelined manner, while the filtered data are written back in turn to the intermediate data buffer.
Step 10: After horizontal boundary filtering of one block of the current macroblock is complete, the data of the upper macroblock that are ready for output are written to the output data buffer and the output interface is notified to write them to external memory; at the same time, horizontal filtering of the next block boundary of the current macroblock begins.
Step 11: Repeat the operations of step 8, step 9 or step 10 until horizontal filtering of all block boundaries is complete.
Step 12: Complete the loop filtering of every macroblock in a frame in this manner and output the result to external memory.
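The amount of boundary work implied by the block size can be estimated with a short Python sketch. It assumes a 16x16 luma macroblock and counts every block boundary, including the one shared with the neighbouring macroblock; whether a given boundary is actually filtered depends on its boundary strength, which is not modelled here. The function names are hypothetical.

```python
def edge_positions(block_size, mb_size=16):
    """Offsets within a macroblock at which block boundaries lie.

    Position 0 is the boundary shared with the left (or upper) neighbour.
    mb_size=16 is an assumption for luma data.
    """
    if mb_size % block_size != 0:
        raise ValueError("block size must divide the macroblock size")
    return list(range(0, mb_size, block_size))

def lines_per_direction(block_size, mb_size=16):
    """Number of 1-D sample lines crossing block boundaries in one direction."""
    return len(edge_positions(block_size, mb_size)) * mb_size
```

With the 8x8 blocks of this embodiment, `edge_positions(8)` gives `[0, 8]` and `lines_per_direction(8)` gives 32 lines for vertical boundaries and the same number for horizontal ones. The 4x4 blocks of the second embodiment give `[0, 4, 8, 12]` and 64 lines, so twice as many boundaries pass through the filter per macroblock.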
Second embodiment:
A loop filter device based on a multi-stage parallel pipeline, and the steps by which it implements the loop filter algorithm of the MPEG-4 Part 10/H.264 standard:
As shown in Fig. 3, a loop filter device based on a multi-stage parallel pipeline comprises three pipeline-stage modules: a macroblock boundary calculation module, a vertical/horizontal filtering module and an output module. The three modules operate concurrently in pipeline order.
The macroblock boundary calculation module comprises a boundary strength calculator, a boundary threshold calculator and a FIFO (first-in-first-out) data buffer. The boundary strength calculator and the boundary threshold calculator each fetch from external memory the boundary data of the macroblock to be filtered, compute the boundary strength and threshold of every block in the macroblock, and store them in the FIFO data buffer. The FIFO data buffer passes the boundary strengths and thresholds to the vertical/horizontal filtering module in first-in-first-out order.
The vertical/horizontal filtering module comprises a vertical/horizontal filter, an arranger, an intermediate data buffer, a selector and a loop filter controller. The selector is connected to external memory; under the control of the loop filter controller it fetches the macroblock boundary data to be filtered from external memory and stores them in the intermediate data buffer. The arranger, connected to the intermediate data buffer, arranges the macroblock boundary data in row or column order and feeds them in turn into the vertical/horizontal filter. Under the control of the loop filter controller, the vertical/horizontal filter performs vertical/horizontal boundary filtering on the row and column data from the arranger, writes data that need further filtering back to the intermediate data buffer, and writes fully filtered data to the output module.
The output module comprises an output data buffer and an external memory write interface. The output data buffer stores the fully filtered data received from the vertical/horizontal boundary filter and passes them to the external memory write interface, which writes them to external memory under the control of the loop filter controller.
The device as a whole performs loop filtering in three pipeline stages. The first stage computes the boundary strength and boundary threshold of every filtering boundary of the macroblock to be filtered. The second stage performs vertical and horizontal filtering on each of those boundaries. The third stage outputs the filtered macroblock data to external memory. Distributing the steps of loop filtering over separate pipeline stages both reduces the complexity of each stage and increases the parallelism of the loop filtering operation as a whole.
As shown in Fig. 4, the horizontal and vertical boundary filtering of the second pipeline stage is itself designed as a multi-stage pipeline. With these two parallel pipeline structures, the device can fully meet the loop filtering requirements of real-time high-definition video encoding and decoding.
The vertical/horizontal filter comprises a filter condition decider module and a filter calculator module. The filter condition decider module comprises a filter condition decider and a data buffer. The filter condition decider receives the row and column data from the arranger and the boundary strengths and thresholds from the FIFO data buffer, evaluates the boundary filtering decision conditions, determines which boundary pixels are to be filtered, and passes those pixels through the data buffer to the next pipeline stage for filtering. The filter calculator module comprises a filter calculator, which performs the filtering calculation on each pixel and outputs the filtered pixels.
The filter condition decider may be divided into two or more pipeline stages, which operate concurrently in pipeline order. The filter calculator may likewise be divided into two or more pipeline stages, which operate concurrently in pipeline order.
The present invention implements the loop filtering algorithm of the MPEG-4 Part 10/H.264 standard through the following steps:
Step 1: Calculate the boundary filtering strength and boundary thresholds for the 4x4 block data in the macroblock. Compared with the AVS standard, more blocks need to be filtered, but the required intermediate buffer is smaller.
Step 2: At the same time, read the data of the left, upper, and current blocks to be filtered from external memory and place it in the intermediate data buffer.
Step 3: Once the block data, boundary filtering strength, and boundary thresholds of the current macroblock are ready, they pass through the intermediate data buffer together with the to-be-filtered data of the left block, are sent simultaneously to the data arranger, and are then fed row by row into the vertical/horizontal filter.
Step 4: The vertical/horizontal filter filters the vertical boundary data in pipelined fashion, while the filtered data is written back to the intermediate data buffer in sequence.
Step 5: After vertical boundary filtering of one block of the current macroblock is complete, write the data of the left macroblock that is ready for output to the output data buffer and notify the output interface to write it to external memory; at the same time, begin vertical filtering of the next block boundary of the current macroblock.
Step 6: Repeat the operations of Step 3, Step 4, or Step 5 until vertical filtering of all block boundaries is complete.
Step 7: Fetch from the intermediate buffer the data of the upper and current macroblock boundaries awaiting horizontal filtering and send it to the data arranger.
Step 8: The block data of the current macroblock and the upper block data to be filtered pass through the intermediate data buffer, are sent simultaneously to the data arranger, and are then fed column by column into the vertical/horizontal filter.
Step 9: The vertical/horizontal filter filters the horizontal boundary data in pipelined fashion, while the filtered data is written back to the intermediate data buffer in sequence.
Step 10: After horizontal boundary filtering of one block of the current macroblock is complete, write the data in the upper macroblock that is ready for output to the output data buffer and notify the output interface to write it to external memory; at the same time, begin horizontal filtering of the next block boundary of the current macroblock.
Step 11: Repeat the operations of Step 8, Step 9, or Step 10 until horizontal filtering of all block boundaries is complete.
Step 12: In the manner described above, complete the loop filtering of every macroblock in a frame and output the result to external memory.
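The ordering implied by the steps above (all vertical-boundary filtering first, then all horizontal-boundary filtering, one 4x4 block at a time) can be sketched as a schedule generator. The raster traversal of blocks within each pass, the function name `macroblock_schedule`, and the tuple layout are assumptions; the sketch also ignores chroma, buffer traffic, and the skipping of boundaries whose filtering strength is zero.

```python
def macroblock_schedule(mb_size=16, block=4):
    """Yield (pass_name, block_x, block_y, line) tuples for one macroblock.

    pass_name is "vertical" for vertical-boundary filtering (data fed
    row by row) and "horizontal" for horizontal-boundary filtering
    (data fed column by column). Block order within a pass is raster
    order, which is an assumption of this sketch.
    """
    n = mb_size // block  # blocks per macroblock side

    # Pass 1: vertical boundaries, one block at a time, row by row.
    for by in range(n):
        for bx in range(n):
            for row in range(block):
                yield ("vertical", bx, by, row)

    # Pass 2: horizontal boundaries, one block at a time, column by column.
    for by in range(n):
        for bx in range(n):
            for col in range(block):
                yield ("horizontal", bx, by, col)
```

For a 16x16 luma macroblock with 4x4 blocks this yields 128 entries: 64 row operations for the vertical pass followed by 64 column operations for the horizontal pass.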
The present invention uses a multi-stage parallel pipeline structure to perform loop filtering on the macroblock boundaries of a video image. This ensures that loop filtering runs in real time during image encoding and decoding, reduces the pressure of external memory accesses, and keeps the complexity of the overall hardware structure under control, which facilitates design and implementation.
Finally, it should be noted that the above embodiments serve only to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or replaced with equivalents without departing from its spirit and scope, and all such modifications and replacements shall fall within the scope of the claims of the present invention.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB2004100702068A CN1306826C (en) | 2004-07-30 | 2004-07-30 | Loop filter based on multistage parallel pipeline mode |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB2004100702068A CN1306826C (en) | 2004-07-30 | 2004-07-30 | Loop filter based on multistage parallel pipeline mode |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1589032A (en) | 2005-03-02 |
| CN1306826C (en) | 2007-03-21 |
Family
ID=34604436
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CNB2004100702068A Expired - Lifetime CN1306826C (en) | 2004-07-30 | 2004-07-30 | Loop filter based on multistage parallel pipeline mode |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN1306826C (en) |
Families Citing this family (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN100438630C (en) * | 2006-03-31 | 2008-11-26 | 清华大学 | Multi-pipeline phase information sharing method based on data buffer storage |
| CN100446573C (en) * | 2006-06-22 | 2008-12-24 | 上海交通大学 | VLSI Realization Device of Deblocking Filter Based on AVS |
| CN101115195B (en) * | 2006-07-24 | 2010-08-18 | 同济大学 | Macroblock grade coupled decoding and loop filtering method and apparatus for video code stream |
| JP4712642B2 (en) * | 2006-08-17 | 2011-06-29 | 富士通セミコンダクター株式会社 | Deblocking filter, image encoding device, and image decoding device |
| CN101193305B (en) * | 2006-11-21 | 2010-05-12 | 安凯(广州)微电子技术有限公司 | Inter-frame prediction data storage and exchange method for video coding and decoding chip |
| CN101005619B (en) * | 2006-12-25 | 2010-09-01 | 海信集团有限公司 | Loop circuit filtering method |
| CN101841722B (en) * | 2010-06-08 | 2011-08-31 | 上海交通大学 | Detection method of detection device of filtering boundary strength |
| CN102572416B (en) * | 2010-12-22 | 2014-11-05 | 中兴通讯股份有限公司 | Video filtering method and device |
| CN102098515B (en) * | 2011-02-18 | 2012-12-12 | 杭州海康威视数字技术股份有限公司 | Realizing method of loop filtering parallel |
| CN102223538A (en) * | 2011-06-17 | 2011-10-19 | 中兴通讯股份有限公司 | Parallel filtering method and device |
| CN103731674B (en) * | 2014-01-17 | 2017-02-01 | 合肥工业大学 | H.264 two-dimensional parallel post-processing block removing filter hardware achieving method |
| CN105791865B (en) * | 2014-12-22 | 2020-01-17 | 江苏省电力公司南京供电公司 | Intra-frame prediction and deblocking filtering method |
| CN107392838B (en) * | 2017-07-27 | 2020-11-27 | 苏州浪潮智能科技有限公司 | Method and device for parallel acceleration of WebP compression based on OpenCL |
| US10939102B2 (en) * | 2018-11-01 | 2021-03-02 | Mediatek Inc. | Post processing apparatus with super-resolution filter and loop restoration filter in block-level pipeline and associated post processing method |
| CN115002488A (en) * | 2021-03-01 | 2022-09-02 | 炬芯科技股份有限公司 | In-loop filter for codec and filtering method |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1170318A (en) * | 1996-07-06 | 1998-01-14 | 三星电子株式会社 | Loop filtering method for reducing group effect of moving compensation image and ringing noise |
| CN1189652A (en) * | 1997-01-29 | 1998-08-05 | 三星电子株式会社 | Loop filter and loop filtering method |
| US20030026337A1 (en) * | 2001-06-15 | 2003-02-06 | Lg Electronics Inc. | Loop filtering method in video coder |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1170318A (en) * | 1996-07-06 | 1998-01-14 | 三星电子株式会社 | Loop filtering method for reducing group effect of moving compensation image and ringing noise |
| JPH1066090A (en) * | 1996-07-06 | 1998-03-06 | Samsung Electron Co Ltd | Loop Filtering Method for Blocking Effect and Ringing Noise Reduction of Motion Compensated Video |
| CN1189652A (en) * | 1997-01-29 | 1998-08-05 | 三星电子株式会社 | Loop filter and loop filtering method |
| US20030026337A1 (en) * | 2001-06-15 | 2003-02-06 | Lg Electronics Inc. | Loop filtering method in video coder |
Also Published As
| Publication number | Publication date |
|---|---|
| CN1589032A (en) | 2005-03-02 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN1306826C (en) | Loop filter based on multistage parallel pipeline mode | |
| CN1794814A (en) | Pipelined deblocking filter | |
| CN101039430A (en) | Method for scanning quickly residual matrix in video coding | |
| CN104253998B (en) | Hardware on-chip storage method of deblocking effect filter applying to HEVC (High Efficiency Video Coding) standard | |
| CN101047850A (en) | In-frame prediction processing | |
| CN100531392C (en) | Hardware implementation method for H.264 block elimination effect filter | |
| CN101115207B (en) | Method and device for implementing intra prediction based on correlation between prediction points | |
| CN1921625A (en) | Filtering method and system of deblocking effect filter | |
| CN101841722B (en) | Detection method of detection device of filtering boundary strength | |
| CN1874516A (en) | Implementation device in VLSI of filter for removing blocking effect based on AVS | |
| CN105635731B (en) | The inter-frame predicated reference point preprocess method of efficient video coding | |
| CN101459839A (en) | Deblocking effect filtering method and apparatus for implementing the method | |
| CN102801973B (en) | Video image deblocking filter method and device | |
| TWI517695B (en) | On die/off die memory management | |
| CN1230000C (en) | Scanning method and device for transform coefficient block in video codec | |
| CN1852442A (en) | Layering motion estimation method and super farge scale integrated circuit | |
| CN105376586A (en) | Three-level flow line hardware architecture suitable for integer motion estimation in HEVC standard | |
| CN102075753A (en) | Method for deblocking filtration in video coding and decoding | |
| CN1271864C (en) | Control device and method for video frequency decoding buffer zone | |
| CN105872553A (en) | Method for adaptive loop filter based on parallel computing | |
| CN1655617A (en) | Unified decoder architecture | |
| CN1452409A (en) | Picture motion estimating method | |
| CN102186082B (en) | H.264 protocol based optimized decoding method for intra-frame coding compression technology | |
| CN101005619A (en) | Loop circuit filtering method | |
| CN1787382A (en) | Method for buffer area read-write by reducing buffer area size of on-line image compression data |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | ASS | Succession or assignment of patent right | Owner name: ZHANXUN COMMUNICATIONS (SHANGHAI) CO., LTD. Free format text: FORMER OWNER: UNITED XINYUAN DIGITAL AUDIO-VIDEO TECHNOLOGY (BEIJING) CO., LTD. Effective date: 20070608 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TR01 | Transfer of patent right | Effective date of registration: 20070608 Address after: 201203 Shanghai city Zuchongzhi road Pudong Zhangjiang hi tech Park Lane 2288 newshow Center Building No. 1 Patentee after: Spreadtrum Communications (Shanghai) Inc. Address before: 100080 North building, room 6, 140 South Road, Haidian District Academy of Sciences, Beijing Patentee before: UNITED XINYUAN DIGITAL AUDIO V |
| | TR01 | Transfer of patent right | Effective date of registration: 20190314 Address after: 101399 Building 8-07, Ronghui Garden 6, Shunyi Airport Economic Core Area, Beijing Patentee after: Xin Xin finance leasing (Beijing) Co.,Ltd. Address before: 201203 Shanghai city Zuchongzhi road Pudong Zhangjiang hi tech Park Lane 2288 newshow Center Building No. 1 Patentee before: Spreadtrum Communications (Shanghai) Inc. |
| | TR01 | Transfer of patent right | |
| | EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20050302 Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd. Assignor: Xin Xin finance leasing (Beijing) Co.,Ltd. Contract record no.: X2021110000008 Denomination of invention: Loop filter based on multistage parallel pipeline Granted publication date: 20070321 License type: Exclusive License Record date: 20210317 |
| | EE01 | Entry into force of recordation of patent licensing contract | |
| | TR01 | Transfer of patent right | Effective date of registration: 20221018 Address after: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288 Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd. Address before: 101399 Building 8-07, Ronghui Garden 6, Shunyi Airport Economic Core Area, Beijing Patentee before: Xin Xin finance leasing (Beijing) Co.,Ltd. |
| | TR01 | Transfer of patent right | |
| | CX01 | Expiry of patent term | |
| | CX01 | Expiry of patent term | Granted publication date: 20070321 |