CN1219403C - MPEG video code rate conversion method based on visual model - Google Patents
- Publication number
- CN1219403C
- Authority
- CN
- China
- Prior art keywords
- dct
- conversion
- code rate
- quantizing
- coefficient
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An MPEG video bit-rate conversion method incorporating a visual model, comprising the steps of: partially decoding the input bitstream; truncating the DCT coefficients to remove those above the cut-off frequency; rate control, which re-determines the quantization factor of each macroblock; and re-encoding. The invention exploits the Fovea visual model during conversion to improve conversion efficiency, produce a lower-bit-rate bitstream with comparatively better subjective quality, and further reduce the amount of computation.
Description
Technical field
The invention relates to a bit-rate conversion method for MPEG video bitstreams.
Background
With the development of video compression and networking technology, network multimedia services such as multipoint video conferencing, video on demand, and digital television continue to emerge. To support these services, a video server must adapt to the heterogeneity of clients and transmission channels, and must therefore be able to transcode video bitstreams. Transcoding includes syntax conversion, (spatial and temporal) resolution conversion, bit-rate conversion, and so on. The present invention addresses bit-rate conversion, that is, converting an existing video bitstream into a lower-bit-rate bitstream that fits the actual bandwidth limit of the transmission channel.
Many video transcoding methods exist; they fall into three architectures: (1) cascaded pixel-domain transcoding; (2) fast cascaded pixel-domain transcoding; (3) DCT (discrete cosine transform) domain transcoding. Cascaded pixel-domain transcoding fully decodes and then re-encodes the stream, which is computationally expensive and very slow. DCT-domain transcoding operates directly in the DCT domain with no DCT/IDCT step, so its computational cost is small, but its flexibility is limited: it is difficult to apply when motion vectors must change, and it is hard to extend. Fast cascaded pixel-domain transcoding is a simplified form of cascaded pixel-domain transcoding. Because it needs no motion estimation, it is markedly faster than cascaded pixel-domain transcoding; because it still performs DCT/IDCT, it is slower than DCT-domain transcoding.
Existing video transcoding methods make little use of the characteristics of the human visual system (HVS). As a result, the converted low-bit-rate bitstream matches HVS characteristics poorly, its subjective quality is poor, and conversion efficiency is low.
Summary of the invention
The object of the invention is to provide a fast MPEG video bit-rate conversion method consistent with HVS characteristics, so that video bitstreams with better subjective quality can be delivered in heterogeneous network environments.
To achieve this object, an MPEG video bit-rate conversion method incorporating a visual model comprises the steps of:
partially decoding the input bitstream;
DCT coefficient truncation, removing coefficients above the cut-off frequency;
rate control, re-determining the quantization factor of each macroblock;
re-encoding.
The invention exploits the Fovea visual model during conversion to improve conversion efficiency, produce a lower-bit-rate bitstream with comparatively better subjective quality, and further reduce the amount of computation.
Description of drawings
Figure 1 is a schematic diagram of the structure of the invention;
Figure 2 is the multi-resolution frequency-band representation of an 8×8 block of DCT coefficients.
Detailed description
To aid understanding of the invention, the Fovea visual model is explained first. Research on the HVS shows that the human eye samples visual information non-uniformly. In general, when viewing an image the eye has a fixation point, called the Fovea point, at which perceived sharpness is highest. Moving outward from that point, perceived sharpness falls off rapidly. Based on this property, a Fovea visual model applicable to video coding has been proposed: given the Fovea point, for any point $(x, y)$ in the image, its cut-off frequency $f_c(x, y)$ (the maximum frequency perceivable by the human eye) is determined by the following formulas:
$$d = (x - x_f)^2 + (y - y_f)^2$$
$$B[i, V] = \min\{\, r^2 : [f_c(r, V) \times 8] = i,\ r \in \mathbb{Z}^+ \,\}$$
where $(x_f, y_f)$ are the coordinates of the Fovea point in the image, $V$ is the distance from the viewpoint to the image, the model parameter is $k = 13.75$, and $R$ is the radius of the circular region centred on the Fovea point, which is coded at the highest perceived sharpness (i.e. $f_c = 1.0$). Information in the image at frequencies above the cut-off frequency $f_c(x, y)$ cannot be perceived by the human eye.
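The boundary table $B[i, V]$ can be illustrated with a short sketch. The closed-form expression for $f_c$ is not reproduced in this text, so `placeholder_fc` below is a hypothetical stand-in and not the patent's formula; the bracket $[\cdot]$ is read as the integer part, which is also an assumption. The function names are illustrative only.

```python
def placeholder_fc(r, V, R=32.0):
    """Hypothetical stand-in for f_c(r, V); NOT the patent's formula.

    Returns 1.0 inside radius R and decays outside, floored at 1/8.
    """
    if r <= R:
        return 1.0
    return max(0.125, 1.0 / (1.0 + (r - R) / V))


def boundary_table(fc, V, r_max):
    """B[i] = smallest squared radius r^2 (r = 1..r_max) with int(fc * 8) == i."""
    table = {}
    for r in range(1, r_max + 1):
        i = int(fc(r, V) * 8)  # [.] read as integer part (assumption)
        if 1 <= i <= 8 and i not in table:
            table[i] = r * r
    return table


def squared_distance(x, y, xf, yf):
    """d = (x - xf)^2 + (y - yf)^2, the squared distance to the Fovea point."""
    return (x - xf) ** 2 + (y - yf) ** 2


def region_index(d, table):
    """Smallest i whose boundary B[i] does not exceed d; 8 if d is inside all."""
    candidates = [i for i, b in table.items() if b <= d]
    return min(candidates) if candidates else 8
```

Because the table stores squared radii, a point is classified by comparing its squared distance `d` directly against the table, with no square root per pixel or per block.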
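A frame is divided into 8 regions. All points within a region share the same cut-off frequency, and different regions have different cut-off frequencies. The cut-off frequency takes values in the range: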
Figure 1 shows the structure of the invention. The abbreviations in the figure are: VLD, variable-length decoding; VLC, variable-length coding; DCT, discrete cosine transform; IDCT, inverse discrete cosine transform; Q, quantization; IQ, inverse quantization; MV, motion vector; MC, motion compensation; FM, frame memory. Because the fast cascaded pixel-domain architecture requires relatively little computation, is flexible, and is easy to extend, the invention is based on that architecture, with corresponding improvements made according to the Fovea visual model. The invention consists mainly of the following parts:
● Partial decoding
The input MPEG video stream at bit rate $R_1$ undergoes variable-length decoding (VLD), followed by inverse quantization (IQ1) using the quantization factor information in the bitstream, which yields the DCT coefficients of each 8×8 block.
● DCT coefficient truncation
According to the Fovea visual model, coefficients in an 8×8 DCT block above the cut-off frequency cannot be perceived. Removing them does not affect subjective visual quality and improves conversion efficiency. The DCT coefficient truncation module is added for this purpose.
An 8×8 block can be treated approximately as having a single cut-off frequency: the centre point of the block is generally taken as representative, and the block's cut-off frequency $f_c$ is computed from its coordinates. An 8×8 block of DCT coefficients can be divided into 8 frequency bands, forming a multi-resolution representation, as shown in Figure 2. For any band $m$, its frequency $f(m)$ is:
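The expression for $f(m)$ and the band layout of Figure 2 are not available in this text. The sketch below illustrates the truncation step under two stated assumptions: band $m$ (1 to 8) holds the coefficients with $\max(u, v) = m - 1$, and $f(m) = m/8$. The names `band_of` and `truncate_block` are hypothetical.

```python
def band_of(u, v):
    """Band index 1..8 of coefficient (u, v); assumes L-shaped bands."""
    return max(u, v) + 1


def truncate_block(block, fc):
    """Zero every coefficient whose assumed band frequency m/8 exceeds fc.

    block is an 8x8 list of lists of DCT coefficients; a new block is returned.
    """
    return [
        [block[u][v] if band_of(u, v) / 8.0 <= fc else 0 for v in range(8)]
        for u in range(8)
    ]
```

With `fc = 0.5`, only the 4×4 low-frequency corner of the block survives, so later requantization and variable-length coding have fewer non-zero coefficients to process.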
● Rate control
To reduce the bit rate of the MPEG video bitstream from $R_1$ to $R_2$, the rate control module re-determines the quantization factor of each macroblock, and the DCT coefficients are requantized accordingly. The invention modifies the original MPEG TM5 rate control method according to the Fovea visual model, yielding a new Fovea-based rate control method whose main steps are as follows:
(1) Frame-level target bit allocation
The method is the same as in TM5 and is not described in detail.
(2) Macroblock-level target bit allocation
Suppose the number of coding bits for a frame is $R$, the frame contains $M$ macroblocks, and each macroblock contains $N$ 8×8 blocks. The original TM5 method distributes the target bits evenly among the macroblocks, so any macroblock $k$ is allocated $R/M$ target bits.
In the allocation used by the invention, one term is the sum of the squares of the cut-off frequencies of the $N$ 8×8 blocks within macroblock $k$, and the other is the sum of the squares of the cut-off frequencies of all 8×8 blocks in the image.
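The allocation formula itself is not preserved in this text. A minimal sketch, assuming each macroblock's share of the frame budget is proportional to the ratio of the two sums just described (its own sum of squared cut-off frequencies over the frame-wide sum); `allocate_macroblock_bits` is a hypothetical name.

```python
def allocate_macroblock_bits(frame_bits, fc_per_macroblock):
    """Split frame_bits across macroblocks in proportion to sum(fc^2).

    fc_per_macroblock[k] lists the cut-off frequencies of the 8x8 blocks
    in macroblock k. Proportional weighting is an assumption, not a
    formula taken from the source.
    """
    weights = [sum(f * f for f in blocks) for blocks in fc_per_macroblock]
    total = sum(weights)
    return [frame_bits * w / total for w in weights]
```

When every block has the same cut-off frequency, the weights are equal and the result reduces to the even TM5 split of $R/M$ bits per macroblock.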
(3) Rate control
The reference quantization factor $Q_i$ of each macroblock is determined from the fullness of the virtual buffer (VBV). The method is the same as in TM5 and is not described in detail.
(4) Adaptive quantization
In TM5, the final quantization factor of a macroblock is determined adaptively from its spatial activity, which is the minimum of the spatial activities of all 8×8 blocks in the macroblock. The spatial activity of an 8×8 block is determined by the rate of change of information within the block, $V$, namely:
where $p_i$ is the luminance value of the $i$-th pixel in the block. This information is not available in the compressed domain, so the invention proposes a method for computing the spatial activity of a DCT block, V_DCT:
where $N$ is the number of AC coefficients in the DCT block below the block's cut-off frequency, and $F_i$ is the value of one of these $N$ coefficients.
From the spatial activities of all 8×8 DCT blocks in a macroblock, the (normalized) spatial activity $NV_i$ of the macroblock is determined. The final quantization factor $mq_i$ of the macroblock is then:
$$mq_i = Q_i \times NV_i$$
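The adaptive quantization step can be sketched as follows. The V_DCT expression is not preserved here, so `v_dct` assumes it is the mean of the squared AC coefficients at or below the cut-off (using the same assumed band layout, band $m$ = $\max(u, v) + 1$ with frequency $m/8$). The activity offset, the normalization, and the clipping of the result to 1..31 follow the usual TM5 description; all function names are hypothetical.

```python
def v_dct(block, fc):
    """Assumed V_DCT: mean square of AC coefficients within the cut-off."""
    vals = [
        block[u][v]
        for u in range(8)
        for v in range(8)
        if (u, v) != (0, 0) and (max(u, v) + 1) / 8.0 <= fc
    ]
    return sum(x * x for x in vals) / len(vals) if vals else 0.0


def macroblock_activity(blocks, fcs):
    """TM5-style activity: 1 + minimum block activity in the macroblock."""
    return 1.0 + min(v_dct(b, f) for b, f in zip(blocks, fcs))


def normalized_activity(act, avg_act):
    """TM5 normalization; the result lies between 0.5 and 2."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def final_quantizer(q_ref, act, avg_act):
    """mq = Q * NV, rounded and clipped to the range 1..31."""
    mq = round(q_ref * normalized_activity(act, avg_act))
    return max(1, min(31, mq))
```

A macroblock whose activity equals the frame average keeps its reference quantizer unchanged, while a much busier macroblock is quantized up to about twice as coarsely.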
● Re-encoding
The coefficients of all DCT blocks in each macroblock are requantized (Q2) using the macroblock's final quantization factor $mq_i$, and then variable-length coded (VLC), producing an MPEG video bitstream at bit rate $R_2$.
● Error drift compensation
The steps above are sufficient to transcode an MPEG video bitstream. However, requantizing the DCT coefficients (Q2) causes a mismatch between the reference pictures at the encoder and the decoder. This leads to error drift, which degrades the picture quality of the converted bitstream. An error drift compensation module is therefore needed to prevent drift.
The difference between the DCT coefficients before and after requantization is inverse transformed (IDCT) into pixel-domain values and stored in the frame memory. Then, using the motion vector (MV) information obtained by partial decoding, motion compensation (MC) is performed in the pixel domain. The resulting prediction is transformed back into DCT coefficients and added to the residual DCT coefficients of the original predicted frame, which compensates for the error drift.
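The first stage of this loop, the requantization error that is fed to the IDCT, can be sketched with a plain uniform quantizer. This is a simplification: real MPEG quantization also applies a weighting matrix and intra/non-intra rules, which are omitted, and the function names are hypothetical.

```python
def requantize(coeff, step):
    """Quantize one coefficient to an integer level (uniform, simplified)."""
    return int(round(coeff / step))


def dequantize(level, step):
    """Reconstruct the coefficient value from its level."""
    return level * step


def requantization_error(block, step):
    """Per-coefficient difference between the original and requantized values.

    In the scheme described above, this difference block is what is inverse
    transformed and stored in frame memory for drift compensation.
    """
    return [
        [c - dequantize(requantize(c, step), step) for c in row]
        for row in block
    ]
```

The magnitude of each error is at most half the quantizer step, so a coarser second quantizer produces a larger correction signal for the loop to carry forward.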
Because IDCT and DCT transforms are required, the computational cost is higher than that of DCT-domain transcoding. According to the Fovea visual model, however, some DCT coefficients need not be computed. On this basis the invention proposes a fast DCT/IDCT computation method that significantly reduces the DCT/IDCT workload. The original DCT and IDCT formulas are:
Let the cut-off frequency of an 8×8 block be
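The fast formulas are not preserved in this text. One plausible reading, sketched here as an assumption, is to evaluate the standard 8×8 DCT and IDCT sums only over the low-frequency indices $u, v < n$, where $n$ is the number of bands retained for the block (for example `int(8 * fc)`). The function names are hypothetical.

```python
import math


def _c(k):
    """Normalization factor C(k) of the 8-point DCT."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


# COS[u][x] = cos((2x + 1) * u * pi / 16), shared by both transforms.
COS = [[math.cos((2 * x + 1) * u * math.pi / 16) for x in range(8)]
       for u in range(8)]


def dct_pruned(pixels, n):
    """Compute only the n x n low-frequency DCT coefficients; others stay 0."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(n):
        for v in range(n):
            s = sum(pixels[x][y] * COS[u][x] * COS[v][y]
                    for x in range(8) for y in range(8))
            out[u][v] = 0.25 * _c(u) * _c(v) * s
    return out


def idct_pruned(coeffs, n):
    """Inverse transform that reads only the n x n low-frequency coefficients."""
    out = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            s = sum(_c(u) * _c(v) * coeffs[u][v] * COS[u][x] * COS[v][y]
                    for u in range(n) for v in range(n))
            out[x][y] = 0.25 * s
    return out
```

In this direct form the inner work falls from 64 to $n^2$ terms per output sample for the IDCT, and from 64 to $n^2$ output coefficients for the DCT, which is where the saving would come from under this assumption.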
Finally, it should be noted that in the invention the Fovea point can be selected interactively by the user with the mouse.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 02157889 CN1219403C (en) | 2002-12-20 | 2002-12-20 | MPEG video code rate conversion method based on visual model |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 02157889 CN1219403C (en) | 2002-12-20 | 2002-12-20 | MPEG video code rate conversion method based on visual model |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1510923A CN1510923A (en) | 2004-07-07 |
| CN1219403C true CN1219403C (en) | 2005-09-14 |
Family
ID=34236740
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 02157889 Expired - Fee Related CN1219403C (en) | 2002-12-20 | 2002-12-20 | MPEG video code rate conversion method based on visual model |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN1219403C (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101527846B (en) * | 2008-12-19 | 2010-11-03 | 无锡云视界科技有限公司 | H.264 variable bit rate control method based on Matthew effect |
| CN102186083A (en) * | 2011-05-12 | 2011-09-14 | 北京数码视讯科技股份有限公司 | Quantization processing method and device |
| JP5768565B2 (en) * | 2011-07-28 | 2015-08-26 | 富士通株式会社 | Moving picture coding apparatus, moving picture coding method, and moving picture coding program |
| CN102752598A (en) * | 2012-07-09 | 2012-10-24 | 北京博雅华录视听技术研究院有限公司 | Fast adaptive code rate control method |
Also Published As
| Publication number | Publication date |
|---|---|
| CN1510923A (en) | 2004-07-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP1529401B1 (en) | System and method for rate-distortion optimized data partitioning for video coding using backward adaptation | |
| CN1192629C (en) | System and method for improved fine granular scalable video using base layer coding information | |
| US6084908A (en) | Apparatus and method for quadtree based variable block size motion estimation | |
| CN1196341C (en) | System and method for encoding and decoding enhancement layer data using base layer quantization data | |
| US6895050B2 (en) | Apparatus and method for allocating bits temporaly between frames in a coding system | |
| CA2295689C (en) | Apparatus and method for object based rate control in a coding system | |
| US7460725B2 (en) | System and method for effectively encoding and decoding electronic information | |
| CN1251511C (en) | Method for generating scalable coded video bitstream with constant quality | |
| CN1274446A (en) | Appts. and method for macroblock based rate control in coding system | |
| CN1420691A (en) | Method and system for control of bit rate based on object | |
| KR19990086789A (en) | Method of selecting re-quantization step size in which bit rate changes rapidly and bit rate control method using the same | |
| CN101098473A (en) | Picture coding method and apparatus | |
| US20070165717A1 (en) | System and method for rate-distortion optimized data partitioning for video coding using parametric rate-distortion model | |
| CN1266335A (en) | Device and method for adjusting bit rate in multiplex system | |
| JPH11122617A (en) | Image compression | |
| CN1186933C (en) | Parallel image sequence bit rate controlling method for digital TV video coder | |
| JP2006500849A (en) | Scalable video encoding | |
| CN1219403C (en) | MPEG video code rate conversion method based on visual model | |
| CN1285215C (en) | Frame rate adjustment method of video communication system | |
| JP2003535496A (en) | Method and apparatus for encoding or decoding an image sequence | |
| EP1484925A2 (en) | Method and device for compressing image data | |
| Omaki et al. | Embedded zerotree wavelet based algorithm for video compression | |
| JPH0970038A (en) | Image data processing method | |
| CN1301015C (en) | Adaptive Reduced Frame Rate Video Transcoding System | |
| Young | Software CODEC algorithms for desktop videoconferencing |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | EE01 | Entry into force of recordation of patent licensing contract | Assignee: Dongguan Santai Electric Appliance Co. Ltd; Assignor: Institute of Computing Technology, Chinese Academy of Sciences; Contract fulfillment period: 2008.5.10 to 2014.5.10; Contract record no.: 2009440000979; Denomination of invention: Visual model induced MPEG video code string rate inversion method; Granted publication date: 20050914; License type: Exclusive license; Record date: 20090730 |
| | LIC | Patent licence contract for exploitation submitted for record | Free format text: EXCLUSIVE LICENSE; TIME LIMIT OF IMPLEMENTING CONTACT: 2008.5.10 TO 2014.5.10; CHANGE OF CONTRACT; Name of requester: DONGGUAN SANTAI ELECTRICAL APPLIANCES CO., LTD.; Effective date: 20090730 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20050914; Termination date: 20191220 |