CN1157059C - Movement character combined video quality evaluation method

Info

Publication number: CN1157059C
Application number: CNB021036047A
Authority: CN
Inventors: 沈兰荪; 田栋; 姚志恒
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2002-01-29
Filing date: 2002-01-29
Publication date: 2004-07-07
Anticipated expiration: 2022-01-29
Also published as: CN1359235A

Abstract

The present invention relates to a video quality evaluation method that incorporates motion characteristics, and belongs to the field of computer digital video processing. It is a comprehensive evaluation method that uses the motion characteristics of a video image sequence, takes the visual characteristics of the human eye into account, and combines the temporal and spatial characteristics of the video. The method provides a way to describe the intensity of motion, derives measures of video clarity and fluency from that description, and combines clarity and fluency into a comprehensive evaluation result. The method comprises, in order: the computer reads the video signal from a video capture card and stores the video capture interval T; a video image compression subroutine is run; a video image reconstruction subroutine is run; a subroutine that calculates the motion intensity of the video sequence is run; a comprehensive video quality evaluation subroutine is run and the comprehensive evaluation result is calculated; the result can be compared with the visual impression of a person watching the video on a display. Experiments show that the comprehensive evaluation results are basically consistent with human visual perception.

Description

A Video Quality Evaluation Method Combining Motion Features

Technical field

The invention relates to the field of computer digital video processing and provides a video quality evaluation method that incorporates motion features.

Background art

Video quality evaluation is one of the foundations of image/video information engineering. In video communication, for example, the video information of the captured subject is transmitted to the receiving end, where an acceptable video is restored. This involves photoelectric conversion, compression, transmission, recording, and other transformations, and the merits of all of these technologies are ultimately reflected in the evaluation of video quality.

For image/video information whose final destination is the human eye, the evaluation should agree with the subjective perception of the human eye. The mechanism of human visual processing is very complex and is not yet fully understood. However, a number of visual phenomena have been identified, and they have influenced research on video quality evaluation. Specifically, these visual properties are the multi-channel structure, visual thresholds, and masking.

1. Multi-channel structure and visual threshold

The human visual system is a multi-channel structure that decomposes an input image into different sensory components. Each sensory channel has its own threshold, called the visual threshold. A stimulus in a given channel is invisible to the human eye if its value is below the visual threshold of that channel.

2. Masking effect

When several stimuli are present, they interfere with one another and change the visual threshold; this is called the visual masking effect. For coded and transmitted video, the original image is the masker and the coding and transmission impairment is the target. Real video images vary widely, and so do the backgrounds against which coded impairment components appear, so the visibility of those components varies as well.

Because of the visual threshold, impairments below the threshold go unnoticed; masking raises the visual threshold and so increases the amount of impairment that remains invisible. This is an important starting point for video quality metrics based on the human visual system. In other words, if impairments can be placed where the human eye cannot see them, the perceived image quality improves.

Masking takes several forms. Sensitivity to impairments drops in very bright or very dark regions, which is called contrast masking. Sensitivity is lower in regions with large spatial variation than in regions with small spatial variation, which is called texture masking. Sensitivity is low for image blocks whose content changes greatly over time, which is motion masking. Immediately after a scene change the visibility of impairments drops, which is scene-change (switching) masking.

At present, video quality is evaluated mainly on the basis of still-image evaluation, and is measured by the average peak signal-to-noise ratio (PSNR) and the frame rate. The frame rate is the number of frames transmitted per second; the average PSNR is defined as:

$\overline{PSNR} = \frac{1}{P} \sum_{k=0}^{P-1} PSNR_k$

where

$PSNR_k = 10 \cdot \log_{10} \frac{255 \cdot 255 \cdot N \cdot M}{\sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \left( f_k(i,j) - f'_k(i,j) \right)^2},$

f_k(i, j) and f′_k(i, j) are the gray levels of the original image and the reconstructed image of frame k, respectively, and M and N are the width and height of the image, respectively.
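The two PSNR formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: frames are represented as lists of rows of 8-bit gray levels, P is taken to be the number of frames in the sequence (as the summation limits imply), and the function names `psnr_frame` and `average_psnr` are hypothetical. Returning infinity for identical frames is a choice made here, not something the text specifies.

```python
import math

def psnr_frame(original, reconstructed):
    """PSNR_k of one frame; each frame is a list of rows of 8-bit gray levels."""
    n_rows = len(original)        # N, image height
    n_cols = len(original[0])     # M, image width
    squared_error = sum(
        (original[i][j] - reconstructed[i][j]) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    if squared_error == 0:
        # Assumption: identical frames are reported as infinite PSNR.
        return math.inf
    return 10.0 * math.log10(255 * 255 * n_rows * n_cols / squared_error)

def average_psnr(originals, reconstructions):
    """Mean of PSNR_k over the P frames of a sequence."""
    values = [psnr_frame(f, g) for f, g in zip(originals, reconstructions)]
    return sum(values) / len(values)
```

Note that the frame rate does not appear anywhere in this computation, which is the separation of temporal and spatial characteristics that the following paragraph criticizes.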

The average PSNR objectively reflects the degree of video distortion, but it does not take the visual characteristics of the human eye into account. Moreover, the frame rate and the average PSNR reflect the temporal and spatial characteristics of the video independently of each other, so this evaluation approach separates the temporal characteristics of the video from its spatial characteristics. As a result, its results are often inconsistent with human visual perception.

Summary of the invention

To overcome the shortcomings of current evaluation methods, the present invention provides a video quality evaluation method that incorporates motion features. The method takes the visual characteristics of the human eye into account and considers the spatial and temporal characteristics of the video together, so that its results can agree with human visual perception. The technical idea of the invention is characterized as follows:

1. Two quantities, video clarity and video fluency, are defined and used as two measures for video quality evaluation. Clarity refers to how clear each frame is; fluency refers to how continuous a video segment is.

2. A method for describing the motion intensity of video images is proposed. The description has two parts: (1) the distribution of motion intensity over the image blocks within one frame, which we call the spatial distribution feature of motion and apply to the evaluation of video clarity; (2) the motion intensity of a whole video frame, which we call the temporal distribution feature of motion and apply to the evaluation of video fluency.

3. The visual characteristics of the human eye are taken into account. Human perception of video clarity and fluency is affected by the motion intensity of the picture. Within a single frame, regions with intense motion tend to attract attention and belong to the viewer's region of interest. When a whole frame of a video moves intensely, the effect of image clarity on the viewer is relatively weakened, while the effect of fluency is relatively strengthened.

4. The clarity and fluency of the video are combined, and a method for calculating a comprehensive video quality evaluation index is proposed.

The technical solution of the invention is shown in Fig. 1 and Fig. 2. It combines the motion characteristics of the video with the visual characteristics of the human eye to evaluate computer-processed video objectively. In the solution, a camera (1) converts the optical image signal of the video image sequence of the target into an electrical signal, and a capture card (2) digitizes the video sequence from the camera and inputs it to the computer processor. The solution is characterized in that it further includes a method, implemented in the computer processor (3), for calculating a comprehensive video quality evaluation index that combines the clarity and fluency of the video. The method comprises the following steps in order:

1) The computer reads the video signal from the video capture card and stores the video capture interval T;

2) The video image compression subroutine is entered; it applies motion compensation, transform, coding, and similar processing to the original video frames captured into the computer in order to compress redundancy;

3) The video image reconstruction subroutine is entered; it decodes the compressed video bitstream and reconstructs the images of the video sequence;

4) Using the two definitions of video clarity and video fluency as the two measures for video quality evaluation:

First, the subroutine for calculating the motion intensity of the video sequence is entered; each video frame is divided into 16×16 image blocks, and the motion intensity of each image block and of the whole frame is calculated;

Next, the comprehensive video quality evaluation subroutine is entered; based on the motion intensity above, the clarity and fluency of the video sequence are calculated first, and then the comprehensive evaluation result;

5) Because human perception of video clarity and fluency is affected by the motion intensity of the picture, the comprehensive evaluation result of step 4) is output so that it can be compared with the visual impression of a person watching the video on the display.

The invention calculates the motion intensity of the video in two steps. Step 1: calculate the motion intensity of each image block, which can be done in two ways:

The first method uses the motion vector of the image block; the formula is:

$MA_k(i,j) = \sqrt{\Delta x_k^2(i,j) + \Delta y_k^2(i,j)}$

where MA_k(i, j) is the motion intensity of the (i, j)-th image block in frame k, and (Δx_k, Δy_k) is the motion vector of that block. The second method uses the gray-level differences between pixels of two adjacent frames; the formula is:

$MA_k(i,j) = \exp\left( \frac{1}{255 \cdot N_p} \sum_{(m,n) \in block(i,j)} \left| L_k(m,n) - L_{k-1}(m,n) \right| \right)$

where block(i, j) is the (i, j)-th block of a frame, L_k(m, n) and L_{k-1}(m, n) are the gray values of the pixel at (m, n) in frames k and k-1, respectively, and N_p is the number of pixels in an image block.
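The two block-level formulas can be written down directly. The sketch below is illustrative only: the function names are hypothetical, a block is assumed to be a list of rows of gray levels, and the motion vector is assumed to be supplied by whatever motion estimation the encoder already performs.

```python
import math

def block_ma_from_vector(dx, dy):
    """Method 1: magnitude of the block's motion vector (dx, dy)."""
    return math.hypot(dx, dy)

def block_ma_from_difference(curr_block, prev_block):
    """Method 2: exp of the mean absolute gray-level difference, scaled by 255."""
    n_pixels = sum(len(row) for row in curr_block)   # N_p
    total = sum(
        abs(c - p)
        for curr_row, prev_row in zip(curr_block, prev_block)
        for c, p in zip(curr_row, prev_row)
    )
    return math.exp(total / (255.0 * n_pixels))
```

The two methods are on different scales: for a block that does not change, the first formula gives 0 while the second gives exp(0) = 1, and for 8-bit gray levels the second formula is bounded above by e.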

Step 2: calculate the motion intensity of a whole video frame; the formula is:

$MA_k = \frac{1}{N_{MB}} \sum_{i} \sum_{j} MA_k(i,j)$

where MA_k is the motion intensity of frame k, N_MB is the number of image blocks in a frame, and MA_k(i, j) has the same meaning as above.
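As a rough end-to-end sketch of the two steps, the code below partitions a frame into square blocks, applies the gray-level-difference formula to each block, and averages the results. It assumes frames are lists of rows of gray levels and that the frame dimensions are exact multiples of the block size; the text does not say how partial blocks at the borders are handled. The names `block_ma_grid` and `frame_ma` are hypothetical.

```python
import math

def block_ma_grid(curr, prev, block=16):
    """Grid of MA_k(i, j) computed with the gray-level-difference formula."""
    rows, cols = len(curr), len(curr[0])
    grid = []
    for top in range(0, rows, block):
        grid_row = []
        for left in range(0, cols, block):
            total = sum(
                abs(curr[m][n] - prev[m][n])
                for m in range(top, top + block)
                for n in range(left, left + block)
            )
            grid_row.append(math.exp(total / (255.0 * block * block)))
        grid.append(grid_row)
    return grid

def frame_ma(grid):
    """MA_k: mean of the block values over the N_MB blocks of the frame."""
    values = [v for row in grid for v in row]
    return sum(values) / len(values)
```

The `block` parameter defaults to the 16×16 size named in the method; a smaller value is convenient for small test frames.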

When calculating video clarity, the invention assumes that the video sequence input to the computer was captured at a constant high frame rate with capture interval T. In general, not all frames are encoded (processed) and output. k denotes the index of an encoded (processed) frame, and N denotes the total number of encoded (processed) frames. The time between frame k and frame k-1 is FT_k·T. Starting from the PSNR of a still image and taking the visual characteristics of the human eye into account, we give a higher weight to the PSNR of image blocks with more intense motion and a lower weight to macroblocks with less motion. The clarity of a single video frame is calculated first; the formula is:

$SS_k = 10 \cdot \log_{10} \frac{255 \cdot 255}{\frac{1}{N_{MB}} \sum_{i,j} MA_k(i,j) \cdot Diff_k(i,j)}$

where SS_k is the clarity of one video frame, N_MB and MA_k(i, j) have the same meanings as above, and

$Diff_k(i,j) = \sum_{(m,n) \in block(i,j)} \left( L_{k,Input}(m,n) - L_{k,Output}(m,n) \right)^2,$

where L_{k,Input}(m, n) and L_{k,Output}(m, n) are the gray values of the pixel at (m, n) in the originally captured video and in the processed, reconstructed video, respectively. The clarity of a video segment is then calculated; the formula is:

$PS = \frac{1}{N} \sum_{k=1}^{N} \left( \frac{1}{MA_k} \cdot FT_k \cdot SS_k \right)$

where PS is the clarity of a video segment (N frames); the larger PS is, the clearer the video picture. The formula has two implications. On the one hand, the clarity of frames with more intense motion contributes relatively less to the clarity of the whole segment. On the other hand, the clarity of frames that occupy more time during playback contributes relatively more. Both are consistent with human visual perception.
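The clarity formulas can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the per-block values MA_k(i, j) and Diff_k(i, j) are assumed to be given as flat lists in the same block order, and the code assumes MA_k > 0 for every frame (this always holds for the gray-level-difference method, whose values are at least 1, but a fully static frame under the motion-vector method would need separate handling that the text does not describe).

```python
import math

def block_diff(input_block, output_block):
    """Diff_k(i, j): sum of squared gray-level errors over one block."""
    return sum(
        (a - b) ** 2
        for in_row, out_row in zip(input_block, output_block)
        for a, b in zip(in_row, out_row)
    )

def frame_clarity(ma_blocks, diff_blocks):
    """SS_k from per-block motion intensities and squared errors."""
    n_mb = len(ma_blocks)
    weighted = sum(ma * d for ma, d in zip(ma_blocks, diff_blocks)) / n_mb
    if weighted == 0:
        # Assumption: a frame reconstructed without error gets infinite clarity.
        return math.inf
    return 10.0 * math.log10(255 * 255 / weighted)

def segment_clarity(ss, ma, ft):
    """PS: mean over the N encoded frames of FT_k * SS_k / MA_k."""
    return sum(f * s / m for s, m, f in zip(ss, ma, ft)) / len(ss)
```

For example, two frames with the same SS_k contribute equally to PS when one has twice the motion intensity but is also displayed for twice as long, since the factors 1/MA_k and FT_k cancel.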

When calculating the fluency of a video segment, the invention takes into account that the human eye cannot resolve image detail in frames with intense motion. In that case image clarity matters less to the viewer and continuity of motion matters more, so the frame rate should be raised when motion is intense. For frames with gentler motion, the eye can resolve image detail, so the clarity of each frame should be improved and a higher frame rate is not needed. Video fluency is calculated according to this visual characteristic; the formula is:

$PT = \frac{1}{N} \sum_{k=1}^{N} \left( MA_k \cdot FT_k \right)$

where PT is the fluency of a video segment (N frames); the smaller PT is, the smoother the video.
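A short sketch of the fluency formula, with a toy comparison of two frame schedules. The function name `segment_fluency` and the example numbers are made up for illustration; they are not measurements from the patent.

```python
def segment_fluency(ma, ft):
    """PT: mean over the N encoded frames of MA_k * FT_k (smaller is smoother)."""
    return sum(m * f for m, f in zip(ma, ft)) / len(ma)

# Toy data: the second frame has twice the motion intensity of the first.
motion = [1.0, 2.0]

# Both schedules span the same total of 6 capture intervals.
even_gaps = [3, 3]       # constant spacing
adaptive_gaps = [4, 2]   # shorter gap where motion is more intense

pt_even = segment_fluency(motion, even_gaps)          # (3 + 6) / 2 = 4.5
pt_adaptive = segment_fluency(motion, adaptive_gaps)  # (4 + 4) / 2 = 4.0
```

With the same total duration, shortening the gap before the high-motion frame lowers PT, which is the behavior the formula is meant to reward.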

The comprehensive evaluation index of a video segment is calculated as:

$P = W_S \cdot PS + W_T \cdot \frac{1}{PT}$

where P is the comprehensive quality evaluation value of a video segment, PS is the clarity, PT is the fluency, and W_S and W_T are the weights of clarity and fluency in the comprehensive evaluation value.

Based on a large number of experiments, we take W_S = 0.55 and W_T = 0.45; with these values the comprehensive evaluation result is basically consistent with human visual perception.
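The combined index is a one-line computation. In the sketch below the function name `overall_quality` is hypothetical, the default weights are the values stated above, and PT is assumed to be strictly positive.

```python
def overall_quality(ps, pt, w_s=0.55, w_t=0.45):
    """P = W_S * PS + W_T * (1 / PT); assumes pt > 0."""
    return w_s * ps + w_t / pt
```

Because fluency enters as 1/PT, a smaller PT (a smoother video) raises P, so both terms point in the same direction: larger is better.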

Description of the drawings

Fig. 1 is a block diagram of the comprehensive video quality evaluation system.

Fig. 2 is a flow chart of the main program of the comprehensive video quality evaluation system.

Fig. 3 is a flow chart of the video compression subroutine.

Fig. 4 is a flow chart of the video reconstruction subroutine.

Fig. 5 is a flow chart of the subroutine for calculating video motion intensity.

Fig. 6 is a flow chart of the subroutine for calculating the comprehensive video quality evaluation index.

Fig. 7 shows the motion intensity distribution of the 150-frame video sequence calculated with the motion vector method; the horizontal axis is the video frame number and the vertical axis is the motion intensity of the frame.

Fig. 8 shows the motion intensity distribution of the 150-frame video sequence calculated with the absolute difference method; the horizontal axis is the video frame number and the vertical axis is the motion intensity of the frame.

Fig. 9 shows the clarity of each frame of the video sequence in two different examples; the horizontal axis is the video frame number and the vertical axis is the clarity of the frame. In the figure, the "Variable Frame Rate" curve (■ markers) is the example played at a variable frame rate, and the "Constant Frame Rate" curve (× markers) is the example played at a constant frame rate.

In the figures: 1, camera; 2, video capture card; 3, computer digital video processor; 4, output buffer; 5, video quality evaluation; 6, display; 7, evaluation result.

Table 1: frame numbers of the video frames that were compressed and encoded in the two examples; the first group is played at a constant frame rate and the second group at a variable frame rate.

Detailed description of the embodiments

In the block diagram of the comprehensive video quality evaluation system in Fig. 1, the camera and the video capture card are commercially available. They capture the video sequence and convert the optical image of the target into an electrical image signal that is input to the computer for processing, transmission, and other operations. The computer video processing stage mainly applies compression coding and similar processing to the input video sequence. The processed video is output to a buffer for display. The display is the output device for the video sequence; a person can watch the video on it, and that visual impression can be compared with the comprehensive video quality evaluation result. The comprehensive video quality evaluation stage evaluates the reconstructed video after computer processing and outputs an objective evaluation result, which can be compared with the visual impression of a person watching the video sequence on the display.

Video quality evaluation is implemented mainly in software. The evaluation process is described in detail below with an example.

We obtained a 150-frame video sequence from the camera and video capture card and saved it on the computer's hard disk. The video is first compressed and encoded; after reconstruction, the compressed video bitstream necessarily shows some distortion relative to the original video. The compressed and reconstructed video sequence is then evaluated with the method of the invention.

Step 1: calculate the motion intensity of each video frame. Both methods of describing motion intensity proposed by the invention are used. Fig. 7 and Fig. 8 show the motion intensity distributions of the 150-frame sequence obtained with the two methods; the horizontal axis is the video frame number and the vertical axis is the motion intensity of the frame.

Fig. 7 and Fig. 8 are similar, which shows that both methods can describe the motion intensity of the images objectively. From frame 60 to frame 110 the motion is relatively intense, while in the other frames it is relatively gentle, which matches what a human observer sees.

Step 2: calculate the clarity of the video sequence. To demonstrate the effectiveness of the method for video evaluation, two examples are designed for comparison.

First, consider how the human eye perceives a video sequence. During periods in which the picture moves slowly, the viewer can resolve image detail and is interested in it; because the motion is gentle, consecutive frames differ little, so a lower frame rate is acceptable. During periods of intense motion, the viewer cannot resolve image detail but demands more continuity of motion, so raising the frame rate gives a better visual impression. Therefore, if the playback frame rate of a video sequence varies with the motion intensity of the picture, the visual impression is better than with playback at a constant frame rate.

Given this, the following two examples are designed. They use the same video sequence, the same bandwidth limit, and otherwise identical conditions; only the frame rate differs.

Group 1: the frame rate is constant at 10 frames per second;

Group 2: the frame rate varies with motion intensity, but the average frame rate is still 10 frames per second. The average frame rate is kept the same as in Group 1 so that the bandwidth requirement is the same.

The frame numbers actually encoded in the two examples are listed in Table 1. The frame rate of Group 1 is constant, while that of Group 2 varies: it is higher between frames 60 and 110 and lower elsewhere, which matches the motion intensity calculated above.

Fig. 7 and Fig. 8 show that the motion is relatively intense from frame 60 to frame 110. In the second group, therefore, relatively more frames are encoded in this period and fewer in the other periods, which suits the visual characteristics of the human eye.

Fig. 9 shows the per-frame clarity for the two examples; the horizontal axis is the video frame number and the vertical axis is the clarity of the frame. In the figure, the "Variable Frame Rate" curve (■ markers) is the variable frame rate example and the "Constant Frame Rate" curve (× markers) is the constant frame rate example. In the second example, clarity is relatively high in segments with gentle motion and relatively low in segments with intense motion. The video sequence clarity of the two examples, calculated with the method of the invention, is:

Group 1: 28.2;

Group 2: 32.5.

The higher the clarity of the video sequence, the better the visual impression; the result obtained with the method of the invention is consistent with human visual perception.

Step 3: calculate the fluency of the video sequence. Using the same two examples as in Step 2, the fluency calculated with the method of the invention is:

Group 1: 10.5;

Group 2: 9.0.

The lower the fluency value of the video sequence, the better the visual impression; the result of this evaluation method is consistent with human visual perception.

Step 4: calculate the comprehensive evaluation value of the video sequence. Using the comprehensive evaluation formula of the invention with W_S = 0.55 and W_T = 0.45, the comprehensive evaluation values of the two examples are:

Group 1: 15.6;

Group 2: 17.9.

The higher the comprehensive evaluation value, the better the visual impression of the video sequence. The evaluation method of the invention is consistent with subjective human visual perception.
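As a quick arithmetic check, substituting the reported clarity and fluency values into P = W_S · PS + W_T · (1/PT) with W_S = 0.55 and W_T = 0.45 reproduces the reported comprehensive values to one decimal place:

$P_1 = 0.55 \cdot 28.2 + 0.45 \cdot \frac{1}{10.5} \approx 15.510 + 0.043 \approx 15.55$

$P_2 = 0.55 \cdot 32.5 + 0.45 \cdot \frac{1}{9.0} = 17.875 + 0.050 = 17.925$

These round to 15.6 and 17.9. In both cases the clarity term accounts for almost all of the total, so with these weights and these magnitudes of PS and PT the difference between the two groups comes mainly from clarity.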

Table 1

Group 1 (constant frame rate): 0, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96, 102, 108, 114, 120, 126, 132, 138, 144, 147

Group 2 (variable frame rate): 0, 4, 6, 20, 26, 38, 50, 56, 64, 68, 72, 76, 78, 82, 86, 90, 96, 100, 104, 110, 114, 122, 126, 138, 144, 148, 149
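The frame numbers in Table 1 can be turned into the FT_k values used by the clarity and fluency formulas. The sketch below rests on an interpretation rather than an explicit statement in the text: since the time between encoded frames k and k-1 is FT_k·T and T is the capture interval, FT_k is taken to be the difference between consecutive encoded frame numbers. The first encoded frame has no predecessor, and the text does not say how it is treated, so it is simply skipped here. The name `frame_gaps` is hypothetical.

```python
def frame_gaps(frame_numbers):
    """FT_k as the gap, in capture intervals, between consecutive encoded frames."""
    return [b - a for a, b in zip(frame_numbers, frame_numbers[1:])]

# Encoded frame numbers copied from Table 1.
GROUP_1 = [0, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96,
           102, 108, 114, 120, 126, 132, 138, 144, 147]
GROUP_2 = [0, 4, 6, 20, 26, 38, 50, 56, 64, 68, 72, 76, 78, 82, 86, 90, 96,
           100, 104, 110, 114, 122, 126, 138, 144, 148, 149]

gaps_1 = frame_gaps(GROUP_1)
gaps_2 = frame_gaps(GROUP_2)
```

The resulting lists can be passed as the `ft` argument wherever the formulas call for FT_k.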

Claims

1. A video quality evaluation method combining motion features, in which a camera (1) converts the optical image signal of the video image sequence of the captured target into an electrical signal, and a capture card (2) digitizes the video sequence from the camera and inputs it to a computer processor, characterized in that it further comprises a method, implemented in the computer processor (3), for calculating a comprehensive video quality evaluation index that combines the clarity and fluency of the video, the method comprising the following steps in order:

1) the computer reads the video signal from the video capture card and stores the video capture interval T;

2) the video image compression subroutine is entered, which applies motion compensation, transform, coding, and similar processing to the original video frames captured into the computer in order to compress redundancy;

3) the video image reconstruction subroutine is entered, which decodes the compressed video bitstream and reconstructs the images of the video sequence;

4) using the two definitions of video clarity and video fluency as the two measures for video quality evaluation, the following steps are performed:

first, the subroutine for calculating the motion intensity of the video sequence is entered, each video frame is divided into 16×16 image blocks, and the motion intensity of each image block and of the whole frame is calculated;

next, the comprehensive video quality evaluation subroutine is entered, and, based on the motion intensity above, the clarity and fluency of the video sequence are calculated first, followed by the comprehensive evaluation result;

5) because human perception of video clarity and fluency is affected by the motion intensity of the picture, the comprehensive evaluation result of step 4) is output for comparison with the visual impression of a person watching the video on the display.

2. The video quality evaluation method combining motion features according to claim 1, characterized in that the motion intensity of the video is calculated in two steps; the first step is to calculate the motion intensity of each image block, which can be done by either of two methods; the first method uses the motion vector of the image block, with the formula:

MA_k(i, j) = \sqrt{\Delta x_k^2(i, j) + \Delta y_k^2(i, j)}

where MA_k(i, j) denotes the motion intensity of the (i, j)-th image block in the k-th video frame and (Δx_k, Δy_k) is the motion vector of that block; the second method uses the difference between the pixel gray levels of two adjacent frames, with the formula:

MA_k(i, j) = \exp\left( \frac{1}{255 \cdot N_p} \sum_{(m, n) \in block(i, j)} \left| L_k(m, n) - L_{k-1}(m, n) \right| \right)

where block(i, j) denotes the (i, j)-th block in a frame, L_k(m, n) and L_{k-1}(m, n) denote the gray values of the pixel at (m, n) in the k-th and (k-1)-th frames respectively, and N_p denotes the number of pixels in an image block; the second step is to calculate the motion intensity of a whole video frame, with the formula:

MA_k = \frac{1}{N_{MB}} \sum_i \sum_j MA_k(i, j)

where MA_k denotes the motion intensity of the k-th video frame and N_MB denotes the number of image blocks in a frame.
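The motion intensity formulas of claim 2 can be illustrated with a short Python sketch. It is not the patented subroutine: frames are assumed to be lists of rows of integer gray values (0 to 255), frame dimensions are assumed to be multiples of the 16×16 block size (partial blocks are ignored), and all function names are hypothetical.

```python
import math

BLOCK = 16  # block size named in claim 1


def block_motion_from_vector(dx, dy):
    """Method 1: magnitude of the block's motion vector (dx, dy)."""
    return math.sqrt(dx * dx + dy * dy)


def block_motion_from_frames(cur, prev, top, left, size=BLOCK):
    """Method 2: exp of the summed absolute gray difference over 255 * N_p."""
    total = 0
    for m in range(top, top + size):
        for n in range(left, left + size):
            total += abs(cur[m][n] - prev[m][n])
    n_p = size * size  # number of pixels in the block
    return math.exp(total / (255.0 * n_p))


def frame_motion(cur, prev, size=BLOCK):
    """Whole-frame value: mean of the method-2 block values over all whole blocks."""
    rows, cols = len(cur), len(cur[0])
    values = [
        block_motion_from_frames(cur, prev, top, left, size)
        for top in range(0, rows - size + 1, size)
        for left in range(0, cols - size + 1, size)
    ]
    return sum(values) / len(values)
```

With the second method a block that does not change between frames gets the value exp(0) = 1, so the per-block value never drops below 1, whereas the first method yields 0 for a zero motion vector.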

3. The video quality evaluation method combining motion features according to claim 1, characterized in that, when calculating the sharpness of the video images, the video sequence input to the computer is assumed to be captured at a constant high frame rate with capture interval T; in general, not all frames are coded (processed) and output; k denotes the index of a coded (processed) frame and N the total number of coded (processed) frames; the interval between the k-th frame and the (k-1)-th frame is FT_k·T; starting from the peak signal-to-noise ratio of still images and taking the visual characteristics of the human eye into account, the peak signal-to-noise ratio of image blocks with more intense motion is given a higher weight and that of macroblocks with less motion a lower weight; first, the sharpness of a single video frame is calculated with the formula:

SS_k = 10 \cdot \log_{10} \frac{255 \cdot 255}{\frac{1}{N_{MB}} \sum_{i, j} MA_k(i, j) \cdot Diff_k(i, j)}

where SS_k denotes the sharpness of a video frame, N_MB denotes the number of image blocks in a frame, MA_k(i, j) denotes the motion intensity of the (i, j)-th image block in the k-th video frame, and

Diff_k(i, j) = \sum_{(m, n) \in block(i, j)} \left( L_{k,Input}(m, n) - L_{k,Output}(m, n) \right)^2,

where L_{k,Input}(m, n) and L_{k,Output}(m, n) are the pixel gray values at (m, n) of the originally captured video and of the reconstructed video after processing, respectively; then the sharpness of a video segment is calculated with the formula:

PS = \frac{1}{N} \sum_{k=1}^{N} \left( \frac{1}{MA_k} \cdot FT_k \cdot SS_k \right)

where PS denotes the sharpness of a video segment (the larger the PS, the clearer the video image) and MA_k denotes the motion intensity of the k-th video frame.
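As a rough illustration of the sharpness formulas in claim 3, the following Python sketch computes Diff_k(i, j), SS_k and PS from already available per-block and per-frame values. The function names and the list-based inputs are assumptions made for the example, and the handling of a zero denominator (identical input and output frames) is a choice of this sketch, since the claim does not address that case.

```python
import math


def block_diff(original, reconstructed, top, left, size=16):
    """Diff_k(i, j): sum of squared gray-level errors inside one block."""
    return sum(
        (original[m][n] - reconstructed[m][n]) ** 2
        for m in range(top, top + size)
        for n in range(left, left + size)
    )


def frame_sharpness(block_motion, block_diffs):
    """SS_k from parallel lists of MA_k(i, j) and Diff_k(i, j), one entry per block."""
    n_mb = len(block_motion)
    weighted = sum(ma * d for ma, d in zip(block_motion, block_diffs)) / n_mb
    if weighted == 0:
        return math.inf  # no error at all; this case is not covered by the claim
    return 10.0 * math.log10(255.0 * 255.0 / weighted)


def video_sharpness(frame_motion, frame_gaps, frame_ss):
    """PS: mean over the coded frames of (1 / MA_k) * FT_k * SS_k."""
    n = len(frame_ss)
    return sum(
        ft * ss / ma for ma, ft, ss in zip(frame_motion, frame_gaps, frame_ss)
    ) / n
```

Note that `video_sharpness` divides by MA_k, so it expects strictly positive frame motion values, which holds when the gray-level difference method of claim 2 is used.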

4. The video quality evaluation method combining motion features according to claim 1, characterized in that, when calculating the fluency of a video segment, the video fluency is calculated in accordance with the visual characteristics of the human eye, with the formula:

PT = \frac{1}{N} \sum_{k=1}^{N} \left( MA_k \cdot FT_k \right)

where PT denotes the fluency of a video segment (N frames), the smaller the PT the smoother the video, MA_k denotes the motion intensity of the k-th video frame, and FT_k is the number of frames skipped between the k-th frame and the (k-1)-th frame during encoding.
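The fluency measure of claim 4 reduces to a weighted mean, sketched below in Python. The function name and the list inputs (one MA_k and one FT_k per coded frame) are assumptions made for the example.

```python
def video_fluency(frame_motion, frame_gaps):
    """PT: mean over the coded frames of MA_k * FT_k (smaller means smoother)."""
    n = len(frame_motion)
    return sum(ma * ft for ma, ft in zip(frame_motion, frame_gaps)) / n


if __name__ == "__main__":
    # Two coded frames, each 6 capture intervals after the previous one.
    print(video_fluency([1.0, 2.0], [6, 6]))
```

Because each gap is multiplied by the motion intensity, a long gap costs more during fast motion than during a nearly static scene.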

5. The video quality evaluation method combining motion features according to claim 1, characterized in that, when calculating the comprehensive evaluation index value of a video segment, the formula is:

P = W_S \cdot PS + W_T \cdot \frac{1}{PT}

where P denotes the comprehensive evaluation value of the video image quality, PS is the sharpness, PT is the fluency, and W_S and W_T are the weights of sharpness and fluency in the comprehensive evaluation value.
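A minimal Python sketch of the comprehensive evaluation formula follows. The default weights 0.55 and 0.45 are the values used in the embodiment; the function name and the sample PS and PT inputs are hypothetical and are not the experimental values behind the reported results.

```python
def comprehensive_score(ps, pt, w_s=0.55, w_t=0.45):
    """P = W_S * PS + W_T * (1 / PT); a higher P means better perceived quality."""
    return w_s * ps + w_t / pt


if __name__ == "__main__":
    # Illustrative inputs only.
    print(comprehensive_score(ps=20.0, pt=9.0))
```

Since PT enters through its reciprocal, a smoother video (smaller PT) raises P, while a sharper video (larger PS) raises it directly.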